In this paper we study linear programming based approaches to the maximum matching
problem in the semi-streaming model. The semi-streaming model has been considered as one of
the models for efficiently processing massive graphs. In this model edges are presented sequentially,
possibly in an adversarial order, and we are only allowed to use a small space. The allowed space
is near linear in the number of vertices (and sublinear in the number of edges) of the input graph.

In recent years, there have been several new and exciting results in the semi-streaming model.
However, broad techniques such as linear programming have not been adapted to this model.
In this paper we present several techniques to adapt and optimize linear programming based
approaches in the semi-streaming model. We use the maximum matching problem as a foil to
demonstrate the effectiveness of adapting such tools in this model and as a consequence we
improve almost all previous results on the semi-streaming maximum matching problem. We
also prove new results on interesting variants.



1 Introduction

Analyzing massive data sets has been one of the key motivations for studying streaming algorithms.
In the streaming model we have sequential access to the input data and the randomly accessible
memory is sublinear in the input size. In recent years, there has been significant progress in analyzing
distributions in a streaming setting (see for example [25]), but similar progress has been elusive in
the context of processing graph data. Massive graphs arise naturally in many disparate domains,
for example, information retrieval, traffic and billing records of large networks, and large scale
scientific experiments, to name only a few. To help process such large graphs efficiently we need to
develop techniques that work for a broad class of problems. Combinatorial optimization problems
provide an example of such a class of problems. Moreover, in many emerging data analysis
applications, large graphs are defined based on implicit relationships between objects [UEI].
Subsequently, the goal is to find suitable combinatorial structure in this large implicit graph, e.g.,
maximum $b$-matchings were considered in [21]. Such edges are often generated through "black box"
transducers which have ill understood structure (or are based on domain specific information) and are
prohibitive to store explicitly. Therefore in either case, whether the edges are explicitly provided
as input or are implicit, it is a useful goal to design algorithms and techniques for graph problems,
and in particular combinatorial optimization problems, without storing the edges. The reader would
immediately observe the connection to "in place algorithms", which also pose the question of solving
a problem using as small a space as possible excluding the input. In many massive data settings, or
when the input is implicitly defined, we are faced with the task of designing in place algorithms with
no random access or writes to the input. Multipass streaming algorithms seem well placed to answer
these types of questions; putting together the two threads of graph streaming and of combinatorial
optimization problems, it is natural to ask: how well can we solve the maximum matching problem and
its variants using small additional (to the input) space where we only make a few passes over the
(possibly adversarial) list of edges?

Graph problems were one of the early problems considered in the streaming model, and it was
shown that even simple problems such as determining the connectedness of a graph require $\Omega(n)$
space [19] (throughout this paper $n$ will denote the number of vertices and $m$ will denote the
number of edges). This result holds even if a constant number of passes is allowed. However,
for problems which are even slightly more involved than connectivity, it is often not clear how to
solve them in space $O(C_\epsilon\, n \operatorname{polylog} n)$, that is, even if we allow the space to be
larger than the lower bound by a polylogarithmic factor and allow the constants in the algorithm to
depend on an accuracy parameter $0 < \epsilon \ll 1$. Observe that this space bound is still sublinear
in the size of the input stream (of $m$ edges).

The semi-streaming model [10, 25] has emerged to be a model of choice in the context of graph
processing, by allowing $O(C_\epsilon\, n \operatorname{polylog} n)$ space for an input stream of $m$ edges
defining a graph over $n$ vertices, arriving in any (including adversarial) order. In recent years there
have been several new results in this semi-streaming model, for example see [10] fTTj [23] [TJ [HJ [7].
Several of these papers address fundamental graph problems including matchings. These papers
demonstrate a rich multidimensional tradeoff between the quality of the solution, the space required
and the number of passes over the data (and of course, the running time). Many of these results are
likely to be used as building blocks for other algorithms. Yet, as with many emerging models, it is
natural to ask: are there broad techniques that can be adapted to the semi-streaming model?

Our results: In this paper we answer both the questions posed previously in the affirmative. In
particular we investigate primal-dual based algorithms for solving a subclass of linear programming
problems on graphs. The maximum weighted matching (MWM) is a classic example of such a problem.
Although augmentation based techniques exist for matching problems in the semi-streaming model,
they become significantly more difficult in the presence of weights (since we need to find shortest
augmenting paths to avoid creating negative cycles) and the best previous result for the maximum
weighted matching problem is a $(\frac{1}{2} - \epsilon)$ approximation using 0{\) passes [23] for any
$\epsilon > 0$. Note that the input for weighted problems in the semi-streaming model is a sequence of
tuples $\{(i, j, w_{ij})\}$ and the weights do not have to be stored. Since the maximum weighted matching
problem is one of the most celebrated combinatorial optimization problems that can be solved optimally,
it is natural to ask if we can achieve an efficient approximation scheme, that is, an approximation
ratio of $(1 - \epsilon)$ for any $\epsilon > 0$. The use of linear programming relaxation allows us to
design such an approximation scheme, as well as improve the number of passes. See Table 1 for a summary
of the results in this paper. We also improve the number of passes for finding the maximum cardinality
matching (MCM) in bipartite graphs by a significant amount. The technique extends to several variants
such as the $b$-matching problem and matching in general graphs. However, the results for general graphs
in this paper have unappealing running times, such as $n^{O(1/\epsilon)}$.

Subsequent Results: The question of designing an FPTAS (an approximation scheme where the
running time is polynomial in both $n$ and $\frac{1}{\epsilon}$) for matching problems in general graphs,
while using a small number of passes, was left open in this paper. In a recent article [2] we show that
such a result is possible using a slightly augmented version of the fractional packing framework of
[28] (which allows us to control tight sets), based on the ideas of the Cunningham-Marsh proof of the
laminarity of tight sets [31]. The result in [2] shows that the non-bipartite matching problem reduces
(after many non-trivial steps, including finding minimum odd cuts in a space efficient manner) to an LP
which corresponds to bipartite matching using "effective weights", and uses the result in this paper
for that part. The main ideas and techniques in [2] are orthogonal to this paper.

Table 1: Summary of results: The required time is $O(m \operatorname{poly}(\frac{1}{\epsilon}, \log n))$
for all results, except *. The space bounds of results presented elsewhere were not always obvious, and
we have omitted reporting them. Note n' = min{n, |OPT| log ^} and $B = n$ for the uncapacitated case;
otherwise $B = \sum_i b_i$. Please note that the result in [2] is subsequent to this paper and builds on
the results herein.

Our Techniques: The matching problem has a rich literature, see [6, 16, 20, 24], as well as fast,
near linear time approximation algorithms |22[ |2"§] 131?] |2"7] [5]. However, these results use random 
access significantly and do not translate to results in the semi-streaming model, and newer ideas 
were used in [101 l23l 135] [9j EJ E] to achieve results in the semi-streaming model. To improve upon 
the results in these papers, we need new and more powerful techniques. 

In this paper we use the multiplicative-weights update meta-method surveyed in [3]. Over many
years there has been a significant thrust in designing fast approximation schemes for packing and
covering type linear programming problems [28], [34], [TH [TT] [12], [3], to name a few. Such a thrust
has existed even outside of theoretical computer science, see the excellent survey in [3]. The
meta-method uses the oracle to progressively improve the feasibility of the dual linear program, but
uses a (guessed) value of the optimal solution. If the oracle does not fail to provide these improvements
within a predetermined number of iterations, we are guaranteed an approximately feasible dual
solution. If the failure of the oracle can be appropriately modified and interpreted to give us
a feasible primal solution, we can use that to verify the guess of the optimal solution and as
a consequence have an overall scheme. While the key intuition in this paper can be viewed as
designing a "streaming separation oracle", it is not clear how to implement (or even define) such
an oracle. There are a super-linear (in $n$) number of conditions (constraints, verification of various
assumptions) involving the input that need to be satisfied (even though the number of variables
is $n$), which mandates random access. Designing an efficient separation oracle is not always trivial
even without any constraint on space; one of the interesting contributions of our paper is to
show that semi-streaming algorithms for maximal matching can be bootstrapped to achieve a
near-optimal matching within a few iterations. However, even if we could design an efficient oracle,
the overall scheme to obtain a good semi-streaming algorithm faces a number of roadblocks. First,
the multiplicative update method typically requires a super-constant number of iterations (to prove
feasibility); this translates to a super-constant number of passes. Reducing the number of passes
to a constant requires that we recursively identify small and critical subgraphs. Second, for the
weighted variants, it is non-trivial to simultaneously ensure that enough global progress is being
made per pass, yet the computation in a pass is local (and in small space). Given the fundamental
nature of these roadblocks, we expect the different ideas developed herein will find use in other
settings as well.

Other Related Work: The result in [26] is related but somewhat orthogonal to our discussion
in this paper.

Roadmap: We revisit the multiplicative weights update method in Section 2. We then demonstrate
the simplest possible (but suboptimal in space and the number of passes) application of this
framework in Section 3, but in this process we develop the basic oracles. We subsequently show
in Section 4 how to (i) improve the space requirement by "simulating" multiple guesses of the
optimum solution in parallel as well as (ii) reduce the number of passes by "simulating" multiple
iterations of the multiplicative weights update method in a single pass. We show how to remove
the dependency on $n$ in Section 5. We finally show some extensions of the maximum matching
problem in Section 6 which also demonstrate the generality of our approach.



2 The Multiplicative Weights Update Meta-Method 

In this section, we briefly explain the multiplicative weights update method; we follow the discussion
presented by Arora, Hazan, and Kale [3]. Suppose that we are given the following LP, its dual LP,
and a guess of the optimal solution $\alpha$, where $A \in \mathbb{R}^{n \times m}$, $b \in \mathbb{R}^n$,
$c \in \mathbb{R}^m$:

$$\text{LP:}\quad \min\ b^T x \quad \text{s.t. } A^T x \ge c,\ x \ge 0 \qquad\qquad \text{Dual LP:}\quad \max\ c^T y \quad \text{s.t. } A y \le b,\ y \ge 0$$

The algorithm proceeds along the weak separation framework [18]. Suppose that the optimal
solution is $\alpha$. The violation of dual constraint $i$ is $A_i y - b_i$. The complementary slackness
conditions mandate that for an optimal solution $x_i (A_i y - b_i) = 0$. One way to express the
complementary slackness conditions as a single condition is to interpret the primal variables
(which are always maintained as positive) as probabilities, and ask: Is there a vector $y$ which
satisfies $c^T y = \alpha$, such that the expected dual violation is at most $\delta$? The vector $y$, which
is the answer to the question, is termed a dual witness.

If the answer to the question posed to an oracle is "yes", and the probabilities were chosen such
that constraints which had larger violations had larger probability mass, then we have a direction
in which the feasibility of the dual solution can be improved. The improvement is measured by a
potential function, which is akin to the notion of dual gap.

If the answer is "no" (referred to as the failure of the oracle), then we know that there is no
"good direction" to improve the solution. This serves as a certificate that the dual LP (with the
additional constraint that the value of the dual solution is at least $\alpha$) is not feasible. For
example, a feasible primal solution whose value is less than $\alpha$ can be one such certificate.
However, since we are asking questions to the oracle that have an approximation parameter, the
certificate is also approximate at best.

But note that neither of the above gives us a solution to the dual. However, we can
produce a dual solution if we achieve two things, in addition to designing an oracle.



• First, if we are careful in choosing the probabilities (which is what the multiplicative update
framework achieves), then we have a way to extract a dual solution which approximately
satisfies all the dual constraints. In fact the solution will be the average of the dual witnesses
found, and this average will approximately satisfy the dual constraints. It is easy to see that
the average satisfies $c^T y = \alpha$. Now in many situations, and for the problems in this paper,
a simple scaling (multiplying each coordinate of this average vector by a constant $c$) can
ensure dual feasibility and we have a $c$ approximate dual solution.

• Second, note that the approximation also depends on the appropriate guess of $\alpha$. Therefore
we need a way of verifying the guess $\alpha$. In this paper we will achieve this by creating a
primal feasible solution whose value is at most $(1 + O(\delta))\alpha$. Observe that the value of a
feasible primal solution (minimization) is an upper bound on the value of any feasible dual solution
(maximization). Therefore we need to focus on the largest guess of $\alpha$ for which the oracle has not
failed.

Since this paper is regarding the application of the multiplicative weights update method in
streaming and not about the framework itself, we refer the reader to the original article [3] for further
discussion of the intuition behind the framework. In what follows, we provide a brief review of
the main definitions and notation (Definition 1), the meta-algorithm (Algorithm 1), and a restatement
of the main result (Theorem 1) in [3]. We also require a minor extension (Corollary 2), and
therefore we restate the proof of the main result in [3] for the sake of completeness.

Algorithm 1 The Multiplicative Weights Update Meta-Method [3]

1: $u_i^1 = 1$ for all $i \in [n]$.
2: for $t = 1$ to $T$ do
3:    Given $u^t$, the oracle returns an admissible dual witness $y^t$. Note that $y^t$ is not required to
      be feasible.
4:    Let $M(i, y^t) = A_i y^t - b_i$ (for all $i$).
5:    For all $i$, set
      $$u_i^{t+1} = \begin{cases} u_i^t (1 + \epsilon)^{M(i, y^t)/\rho} & \text{if } M(i, y^t) \ge 0 \\ u_i^t (1 - \epsilon)^{-M(i, y^t)/\rho} & \text{if } M(i, y^t) < 0 \end{cases}$$
6: end for
7: Output $y = \left(\min_i \frac{b_i}{b_i + 4\delta}\right) \frac{1}{T} \sum_t y^t$. Note, for use in this paper $b_i \ge 1$.
   This step is dependent on the specific problem.
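The loop of Algorithm 1 can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function name `mwu_meta`, the dense list-of-rows representation of $A$, and the convention that the oracle returns `None` on failure are all assumptions made for the sketch.

```python
from typing import Callable, List, Optional

Vector = List[float]


def mwu_meta(A: List[Vector], b: Vector,
             oracle: Callable[[Vector], Optional[Vector]],
             T: int, eps: float, rho: float, delta: float) -> Optional[Vector]:
    """Sketch of the multiplicative weights update meta-method.

    A is an n x m matrix given as a list of rows and b has length n.
    oracle(u) is a hypothetical callback: it returns a dual witness y
    (length m) for the current weights u, or None to signal failure.
    """
    n, m = len(A), len(A[0])
    u = [1.0] * n                      # step 1: uniform initial weights
    y_sum = [0.0] * m
    for _ in range(T):                 # step 2
        y = oracle(u)                  # step 3
        if y is None:
            return None                # oracle failed for this guess
        for i in range(n):
            # step 4: violation of dual constraint i
            viol = sum(A[i][j] * y[j] for j in range(m)) - b[i]
            # step 5: multiplicative update, exponent scaled by the width rho
            if viol >= 0:
                u[i] *= (1 + eps) ** (viol / rho)
            else:
                u[i] *= (1 - eps) ** (-viol / rho)
        for j in range(m):
            y_sum[j] += y[j]
    # step 7: average the witnesses and scale down to restore feasibility
    scale = min(bi / (bi + 4 * delta) for bi in b)
    return [scale * v / T for v in y_sum]
```

In a streaming setting the matrix would not be stored; each oracle call and each round of updates would instead correspond to a pass over the edge stream.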



Definition 1. Algorithm 1 proceeds in iterations and in iteration $t$ finds a dual witness $y^t$. We
define $M(i, y^t) = A_i y^t - b_i$ to be the violation for dual constraint $i$ in iteration $t$. The expected
violation $M(D^t, y^t)$ is the expected value of $M(i, y^t)$ when choosing $i$ with probability proportional
to $u_i^t$, i.e., $\sum_i \frac{u_i^t}{\sum_j u_j^t} M(i, y^t)$. The dual witness $y^t$ is defined to be admissible if it
satisfies

$$M(D^t, y^t) \le \delta, \qquad c^T y^t \ge \alpha, \qquad \text{and} \qquad M(i, y^t) \in [-\ell, \rho] \quad \forall i \in [n] = \{1, \ldots, n\}$$

for parameters of the oracle $\ell$ and $\rho$ such that $0 < \ell \le \rho$. The parameters $\ell, \rho$ will be
constants for the oracles in this paper; $\rho$ is called the width parameter of the oracle. The parameters
$\epsilon$ and $T$ depend on $\rho$, $\ell$, and $\delta$. Note that admissibility does not imply feasibility.
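The three admissibility conditions of Definition 1 translate directly into a check on a vector of violations. The sketch below is illustrative only; the helper names `expected_violation` and `is_admissible`, and the small numerical tolerance `tol`, are assumptions and do not appear in the paper.

```python
def expected_violation(u, violations):
    """Expected value of M(i, y) when i is drawn with probability u_i / sum(u)."""
    total = sum(u)
    return sum(ui * mi for ui, mi in zip(u, violations)) / total


def is_admissible(u, violations, c_dot_y, alpha, delta, ell, rho, tol=1e-12):
    """Check the three conditions of Definition 1 for one dual witness.

    violations[i] holds M(i, y) = A_i y - b_i and c_dot_y holds c^T y.
    """
    return (expected_violation(u, violations) <= delta + tol
            and c_dot_y >= alpha - tol
            and all(-ell - tol <= m <= rho + tol for m in violations))
```

For example, with weights `u = [1, 3]` and violations `[-1, 1]` the expected violation is `(-1 + 3) / 4 = 0.5`, so the witness is admissible for `delta = 0.5` but not for `delta = 0.4`.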
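Theorem 1 (A slight rewording of Corollary 3 in [3]). Let $\delta > 0$ be an error parameter and
$\epsilon = \min\{\frac{\delta}{4\ell}, \frac{1}{2}\}$. Suppose that the oracle returns an admissible solution (see
Definition 1) for $T = \frac{2\rho \ln(n)}{\delta\epsilon}$ iterations in Algorithm 1; then for any constraint
$1 \le i \le n$ we have: $(1 - \epsilon) \sum_t M(i, y^t) \le \delta T + \sum_t M(D^t, y^t)$. Moreover $y$ is a
feasible solution of the Dual LP.

Proof. We analyze the algorithm using a potential function $\Phi^t = \sum_i u_i^t$. Let $\Gamma_i^t = u_i^t / \Phi^t$.
We assume we have an upper bound $\Psi_i \ge \Gamma_i^t$. Note that $\Gamma_i^t$ and $\Psi_i$ are not used in [3].
In this theorem, we use $\Psi_i = 1$ for all $i$, which is obvious from the fact that all weights $u_i^t$ are
positive. We will use a smaller value of $\Psi_i$ to strengthen the theorem later.

We rewrite the proof of [3] using $\Gamma_i^t$ and $\Psi_i$. Observe that $(1 - \epsilon)^{-x}$ and $(1 + \epsilon)^{x}$
are convex (in $x$) for $0 < \epsilon \le \frac{1}{2}$. Therefore it follows that

$$(1 - \epsilon)^{-x} \le (1 + \epsilon x) \text{ for } x \in [-1, 0] \qquad \text{and} \qquad (1 + \epsilon)^{x} \le (1 + \epsilon x) \text{ for } x \in [0, 1]$$

since equality is achieved at the respective endpoints ($x = -1, 0$ for the first fact and $x = 0, 1$ for
the second fact). From $M(i, y^t)/\rho \in [-1, 1]$ (notice $\ell \le \rho$) and the above facts we have:

$$\begin{aligned}
\Phi^{t+1} = \sum_i u_i^{t+1} &= \sum_{i: M(i, y^t) < 0} u_i^t (1 - \epsilon)^{-M(i, y^t)/\rho} + \sum_{i: M(i, y^t) \ge 0} u_i^t (1 + \epsilon)^{M(i, y^t)/\rho} \\
&\le \sum_i u_i^t \left(1 + \epsilon M(i, y^t)/\rho\right) = \Phi^t + \frac{\epsilon}{\rho} \sum_i u_i^t M(i, y^t) = \Phi^t + \frac{\epsilon \Phi^t}{\rho} \sum_i \frac{u_i^t}{\Phi^t} M(i, y^t) \\
&= \Phi^t \left(1 + \epsilon M(D^t, y^t)/\rho\right) \quad \text{(using the definition of } M(D^t, y^t)\text{)} \\
&\le \Phi^t e^{\epsilon M(D^t, y^t)/\rho}.
\end{aligned}$$

Therefore we can conclude that

$$\Phi^{T+1} \le \Phi^1 e^{\frac{\epsilon}{\rho} \sum_{t=1}^{T} M(D^t, y^t)}. \qquad (1)$$

From the algorithm and the definition of $\Psi_i$,

$$u_i^1 (1 + \epsilon)^{\sum_{t: M(i, y^t) \ge 0} M(i, y^t)/\rho} \cdot (1 - \epsilon)^{-\sum_{t: M(i, y^t) < 0} M(i, y^t)/\rho} = u_i^{T+1} \le \Psi_i \Phi^{T+1}.$$

From the definition of $\Gamma_i^t, \Psi_i$, we get $\Phi^1 = u_i^1 / \Gamma_i^1$. Using $\Phi^{T+1}/\Phi^1$ in equation (1), we get

$$(1 + \epsilon)^{\sum_{t: M(i, y^t) \ge 0} M(i, y^t)/\rho} \cdot (1 - \epsilon)^{-\sum_{t: M(i, y^t) < 0} M(i, y^t)/\rho} \le \frac{\Psi_i}{\Gamma_i^1} e^{\frac{\epsilon}{\rho} \sum_{t=1}^{T} M(D^t, y^t)}.$$

Applying the natural log function and simplifying we get:

$$\ln(1 + \epsilon) \sum_{t: M(i, y^t) \ge 0} M(i, y^t) - \ln(1 - \epsilon) \sum_{t: M(i, y^t) < 0} M(i, y^t) \le \rho \ln \frac{\Psi_i}{\Gamma_i^1} + \epsilon \sum_{t=1}^{T} M(D^t, y^t).$$

Now $\ln(1 + \epsilon) - \epsilon(1 - \epsilon) \ge 0$; we have equality at $\epsilon = 0$ and the first derivative of the
left hand side with respect to $\epsilon$ is positive for $\epsilon > 0$. Likewise (using the derivative, but only
over the range $0 < \epsilon \le \frac{1}{2}$) we have $\ln(1 - \epsilon) + \epsilon(1 + \epsilon) \ge 0$. Therefore using
$\ln(1 + \epsilon) \ge \epsilon(1 - \epsilon)$ and $\ln(1 - \epsilon) \ge -\epsilon(1 + \epsilon)$ we get:

$$\begin{aligned}
\frac{\rho}{\epsilon} \ln \frac{\Psi_i}{\Gamma_i^1} + \sum_{t=1}^{T} M(D^t, y^t) &\ge (1 - \epsilon) \sum_{t: M(i, y^t) \ge 0} M(i, y^t) + (1 + \epsilon) \sum_{t: M(i, y^t) < 0} M(i, y^t) \\
&= (1 - \epsilon) \sum_{t=1}^{T} M(i, y^t) + 2\epsilon \sum_{t: M(i, y^t) < 0} M(i, y^t) \\
&\ge (1 - \epsilon) \sum_{t=1}^{T} M(i, y^t) - 2\epsilon \ell T \qquad \text{(from } M(i, y^t) \ge -\ell\text{)}.
\end{aligned}$$

Selecting $T = \frac{2\rho}{\delta\epsilon} \ln \frac{\Psi_i}{\Gamma_i^1}$, $\Psi_i = 1$, and $\epsilon = \min\{\frac{\delta}{4\ell}, \frac{1}{2}\}$, we obtain
$(1 - \epsilon) \sum_t M(i, y^t) \le \delta T + \sum_t M(D^t, y^t)$. Note that $\ln(n)$ arises from $\ln \frac{\Psi_i}{\Gamma_i^1}$,
since $\Gamma_i^1 = 1/n$. Since $\frac{1}{T} \sum_t M(i, y^t) = M(i, \frac{1}{T} \sum_t y^t)$ and $M(D^t, y^t) \le \delta$ for all $t$,
we have $M(i, \frac{1}{T} \sum_t y^t) \le (1 - \epsilon)^{-1} (2\delta) \le 4\delta$ (dividing both the left and right side of the
inequality by $T$) or $A_i (\frac{1}{T} \sum_t y^t) \le b_i + 4\delta$. This translates to the fact that $y$ satisfies
$A_i y \le b_i$. □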
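The parameter choices in Theorem 1 and the four scalar inequalities used in the proof are easy to sanity-check numerically. The sketch below is only a spot check on a grid, not a proof; the helper names `mwu_parameters` and `scalar_facts_hold` are hypothetical, and $T$ is rounded up to an integer as an assumption.

```python
import math


def mwu_parameters(n, rho, ell, delta):
    """Return (eps, T) as selected in Theorem 1, with T rounded up."""
    eps = min(delta / (4 * ell), 0.5)
    T = math.ceil(2 * rho * math.log(n) / (delta * eps))
    return eps, T


def scalar_facts_hold(eps, steps=200, tol=1e-12):
    """Spot-check the convexity and logarithm facts used in the proof."""
    for k in range(steps + 1):
        x = k / steps  # x ranges over [0, 1]
        # (1 + eps)^x <= 1 + eps * x on [0, 1]
        if (1 + eps) ** x > 1 + eps * x + tol:
            return False
        # (1 - eps)^(-x') <= 1 + eps * x' on [-1, 0], written with x = -x'
        if (1 - eps) ** x > 1 - eps * x + tol:
            return False
    return (math.log(1 + eps) >= eps * (1 - eps) - tol
            and math.log(1 - eps) >= -eps * (1 + eps) - tol)
```

The second logarithm bound is the one that needs the restriction on the range of `eps`: it holds at `eps = 0.5` but already fails at `eps = 0.7`.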

We also obtain Corollary 2 since we will assign different initial weights $u_i^1$ on constraints and
use different values of $\Psi_i$. In Section 5 we eliminate the dependency on $\ln(n)$ in Theorem 1 using
the Corollary.

Corollary 2. Let $\delta > 0$ be an error parameter and $\epsilon = \min\{\frac{\delta}{4\ell}, \frac{1}{2}\}$. Let
$\Gamma_i^t = u_i^t / \sum_j u_j^t$ and let $\Psi_i \ge \Gamma_i^t$ for all $t$. Suppose that the oracle returns an admissible
solution (see Definition 1) for $T = \frac{2\rho}{\delta\epsilon} \ln \max_i \frac{\Psi_i}{\Gamma_i^1}$ iterations in Algorithm 1;
then for any constraint $1 \le i \le n$ we have:
$(1 - \epsilon) \sum_t M(i, y^t) \le \delta T + \sum_t M(D^t, y^t)$. Moreover $y$ is a feasible solution of the Dual LP.



3 Warming Up: $O(\frac{1}{\epsilon^3} \log n)$-pass Algorithms

In this section, we provide a $(1 - \epsilon)$-approximation algorithm for bipartite MCM and MWM that
uses $O(\frac{1}{\epsilon^3} \log n)$ passes. We will use the multiplicative weights update method reviewed in
Section 2. Recall that the method provides a solution to the dual problem. We formulate the primal LP
(LP1 and LP3) to be the dual of the actual LP for MCM and MWM (LP2 and LP4) respectively. Note
that the edges are undirected in these LP formulations.

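$$\begin{array}{lll}
\min & \sum_i x_i & \\
\text{s.t.} & x_i + x_j \ge 1 & \forall (i,j) \in E \\
& x_i \ge 0 & \forall i \in V
\end{array} \quad \text{(LP1)} \qquad\qquad
\begin{array}{lll}
\max & \sum_{(i,j) \in E} y_{ij} & \\
\text{s.t.} & \sum_{j: (i,j) \in E} y_{ij} \le 1 & \forall i \in V \\
& y_{ij} \ge 0 & \forall (i,j) \in E
\end{array} \quad \text{(LP2)}$$

$$\begin{array}{lll}
\min & \sum_i x_i & \\
\text{s.t.} & x_i + x_j \ge w_{ij} & \forall (i,j) \in E \\
& x_i \ge 0 & \forall i \in V
\end{array} \quad \text{(LP3)} \qquad\qquad
\begin{array}{lll}
\max & \sum_{(i,j) \in E} w_{ij} y_{ij} & \\
\text{s.t.} & \sum_{j: (i,j) \in E} y_{ij} \le 1 & \forall i \in V \\
& y_{ij} \ge 0 & \forall (i,j) \in E
\end{array} \quad \text{(LP4)}$$

The integrality gap of LP2 (and LP4) is one, since we have a bipartite graph [31]. We first present
an algorithm for MCM and then generalize the algorithm for MWM.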



3.1 The Simple Case of MCM 

We apply the multiplicative weights update method [3] with the oracle provided in Algorithm 2.
Recall that if the oracle does not fail, Algorithm 1 returns a feasible solution for LP2 after $T$
iterations.

We also make the observation that we can compute a maximal matching in one pass in the
semi-streaming model in $O(m)$ time and $O(n)$ space. It is trivial to observe that any maximal
matching is a 2 approximation to the maximum cardinality matching.
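The one-pass maximal matching mentioned above is the standard greedy procedure: keep an edge exactly when both endpoints are still free. A minimal sketch follows, assuming the stream is any iterable of vertex pairs with hashable vertex identifiers; the function name `greedy_maximal_matching` is an assumption for illustration.

```python
def greedy_maximal_matching(edge_stream):
    """One pass over the edges, storing only matched vertices and kept edges."""
    matched = set()    # at most n vertices
    matching = []      # at most n / 2 edges
    for i, j in edge_stream:
        if i != j and i not in matched and j not in matched:
            matched.add(i)
            matched.add(j)
            matching.append((i, j))
    return matching
```

On the path a-b-c-d with the middle edge arriving first, the procedure keeps only `("b", "c")`, while the maximum matching has two edges; this is the worst case allowed by the factor 2.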

Algorithm 2 Oracle for LP1. The input is $\{u_i^t\}_{i \in V}$, $\alpha$.

1: Let $x_i = \frac{\alpha u_i^t}{\sum_j u_j^t}$. Let $E_{violated} = \{(i,j) \mid x_i + x_j < 1\}$.
2: Find a maximal matching $S$ in $E_{violated}$. Let $\Delta = |S|$.
3: if $\Delta < \delta\alpha$ then
4:    For each $(i,j) \in S$, increase $x_i$ and $x_j$ by 1. Observe that $x$ is feasible for LP1.
5:    Return $x$ and report failure.
6: else
7:    Return $y_{ij} = \alpha/\Delta$ for $(i,j) \in S$ and $y_{ij} = 0$ otherwise.
8: end if
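One call to the oracle of Algorithm 2 needs a single pass, because the filter for violated edges and the greedy maximal matching can be applied edge by edge. The following is a sketch under assumptions: the weights are kept in a dictionary `u` keyed by vertex, the return value is a tagged pair, and the name `mcm_oracle` is hypothetical.

```python
def mcm_oracle(edges, u, alpha, delta):
    """Sketch of the oracle for LP1.

    edges: iterable of (i, j) pairs (one pass); u: dict vertex -> weight.
    Returns ("fail", x) with a feasible primal x, or ("witness", y)
    where y maps the chosen edges to alpha / |S|.
    """
    total = sum(u.values())
    x = {v: alpha * uv / total for v, uv in u.items()}     # step 1
    matched, S = set(), []
    for i, j in edges:                                      # step 2
        violated = x[i] + x[j] < 1
        if violated and i not in matched and j not in matched:
            matched.update((i, j))
            S.append((i, j))
    if len(S) < delta * alpha:                              # steps 3-5
        for i, j in S:
            x[i] += 1
            x[j] += 1
        return "fail", x
    return "witness", {e: alpha / len(S) for e in S}        # step 7
```

On failure the returned `x` covers every edge, since each violated edge shares an endpoint with some edge of the maximal matching `S`.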



Lemma 3. If $\Delta \ge \delta\alpha$, the oracle described in Algorithm 2 returns an admissible solution with
$\ell = 1$ and $\rho = 1/\delta$.

Proof. Note $y_{ij} = \alpha/|S|$ for $(i,j) \in S$ and $y_{ij} = 0$ for $(i,j) \notin S$. Therefore, it is obvious that
$\sum_{(i,j) \in E} y_{ij} = \alpha$. Since the vector $c$ is all 1, we have $c^T y \ge \alpha$.

For each edge $(i,j) \in S$ we have $x_i + x_j < 1$, and therefore we have $\sum_{(i,j) \in S} y_{ij}(x_i + x_j) < \alpha$.
Therefore $\sum_{(i,j) \in E} y_{ij}(x_i + x_j) = \sum_{(i,j) \in S} y_{ij}(x_i + x_j) < \alpha$. This rewrites to
$\sum_i x_i \sum_{j: (i,j) \in E} y_{ij} < \alpha$. Observe that $\sum_i x_i = \alpha$ and therefore
$\sum_i x_i \left(\sum_{j: (i,j) \in E} y_{ij} - 1\right) < 0$.

Thus $M(D^t, y) = \frac{1}{\alpha} \sum_i x_i M(i, y) < 0 \le \delta$ and the solution is admissible. Now
$M(i, y) = \sum_{j: (i,j) \in E} y_{ij} - 1 \ge -1$. Since $S$ is a matching, for every $i$ at most one $y_{ij} \ne 0$
and moreover $y_{ij} = \alpha/\Delta \le 1/\delta$ (otherwise the oracle has failed). Therefore
$-1 \le M(i, y) \le 1/\delta$. □

Lemma 4. If $\Delta < \delta\alpha$, Algorithm 2 returns a feasible solution for LP1 with value at most
$(1 + 2\delta)\alpha$.

Proof. Consider $(i,j) \in E$ such that $x_i + x_j < 1$. Since $S$ was maximal, there exists an edge
in $S$ that is adjacent to either $i$ or $j$. So $x_i$ or $x_j$ is increased by at least 1 and the constraint
corresponding to edge $(i,j)$ is satisfied. For each edge $(i,j) \in S$, we increase the objective value
by 2 and $|S| < \delta\alpha$. Since we started with $\sum_i x_i = \alpha$, the solution returned has value at most
$(1 + 2\delta)\alpha$ after the increase. □

Theorem 5. For any $\epsilon \le \frac{1}{2}$ let $T = O(\frac{1}{\epsilon^3} \log n)$. Using $T + 1$ passes and space
O(^) and time 0(ss£) we can find a $(1 - \epsilon)$ approximation to the maximum cardinality matching in
bipartite graphs. This implies a $(\frac{2}{3} - \epsilon)$ result for general graphs using the integrality gap
results of [14] [73].

Proof. We use the first pass to compute OPT to within factor 2; this follows from the fact that
any maximal matching is a 2 approximation to the maximum cardinality matching. Suppose the
size of the maximal matching we found is $q$. We try all possible values of $\alpha = (1 + \frac{\epsilon}{3})^j$ where
$j \ge 0$ and $\alpha \le 2q(1 + \frac{\epsilon}{3})$ in parallel. This corresponds to $O(\frac{1}{\epsilon} \log n)$ guesses of $\alpha$.

Let $\delta = \epsilon/12$. Therefore the update parameter of Theorem 1 equals
$\min\{\frac{\delta}{4\ell}, \frac{1}{2}\} = \epsilon/48$ since $\ell = 1$. Note that the update parameter of Theorem 1 and
the approximation parameter $\epsilon$ used here are different. We now apply Algorithm 1 using Algorithm 2
as the oracle.

Let $\alpha_0$ be the smallest value of $\alpha$ which is above OPT, i.e., $\alpha_0 \ge \mathrm{OPT} \ge \alpha_0/(1 + \frac{\epsilon}{3})$.
Consider $\alpha \le \alpha_0/(1 + \frac{\epsilon}{3})^2 \le \mathrm{OPT}/(1 + \frac{\epsilon}{3})$. For any such value of $\alpha$ it is
impossible that the oracle fails, since on failure we return a feasible primal solution of value at most
$(1 + 2\delta)\alpha = (1 + \epsilon/6)\alpha \le (1 + \frac{\epsilon}{6})\mathrm{OPT}/(1 + \frac{\epsilon}{3}) < \mathrm{OPT}$. Therefore if we
consider the largest value of $\alpha$ for which we do not return a feasible primal solution, that value must
satisfy $\alpha \ge \alpha_0/(1 + \frac{\epsilon}{3})^2$. Let this value be $\alpha^*$. Using Theorem 1, after
$T$ iterations we have a feasible dual solution $y$. Note all $b_i = 1$ and $4\delta = \epsilon/3$. By construction
$\sum_{(i,j) \in E} y^t_{ij} = \alpha^*$ for every $t$. Therefore

$$\sum_{(i,j) \in E} y_{ij} \ge \frac{\alpha^*}{1 + \frac{\epsilon}{3}} \ge \frac{\alpha_0}{(1 + \frac{\epsilon}{3})^3} \ge (1 - \epsilon)\alpha_0 \ge (1 - \epsilon)\mathrm{OPT}.$$

The time and space bounds follow easily. To find the actual matching: Let $\epsilon = \epsilon'/2$ and
run the above steps to find a fractional solution. Since we find $T$ matchings before we return the best
fractional solution, there are at most $m' = O(nT)$ non-zero entries in the solution. Focus on the
graph $G'$ defined by these edges only. The fractional solution of the original graph remains a
fractional solution in $G'$. We now have random access to these edges in $G'$ and can find a $(1 - \epsilon)$
approximation to the best matching contained in these edges (which has at least the same value as
the fractional solution; we use the integrality of the bipartite matching polytope) in time $O(m')$
using known algorithms \2§\ [2^1 [22]. The overall approximation is $(1 - \epsilon)^2 \ge (1 - \epsilon')$. □
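The set of guesses for $\alpha$ tried in parallel in the proof above is a geometric grid. The sketch below assumes the grid consists of the powers of $(1 + \epsilon/3)$ starting from exponent 0 and capped at $2q(1 + \epsilon/3)$, which is one reading of the proof; the function name `alpha_guesses` is hypothetical.

```python
def alpha_guesses(q, eps):
    """Geometric grid of guesses for alpha, given a maximal matching of size q."""
    factor = 1 + eps / 3
    upper = 2 * q * factor
    guesses = []
    alpha = 1.0
    while alpha <= upper:
        guesses.append(alpha)
        alpha *= factor
    return guesses
```

Because consecutive guesses differ by the factor $(1 + \epsilon/3)$ and the grid extends past $2q \ge \mathrm{OPT}$, some guess $\alpha_0$ always satisfies $\mathrm{OPT} \le \alpha_0 \le (1 + \epsilon/3)\mathrm{OPT}$ when $\mathrm{OPT} \ge 1$, which is the property the proof relies on.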

3.2 Abstracting the Oracle 

The intuition behind the oracle, Algorithm 2, will be used for all the algorithms. Although it is
not difficult to see that the discussion about the oracle need not be limited to linear programs for
matching, we do not diverge from that topic in the interest of brevity. In Algorithm 2 we must
choose a subset $S$ of edges which balances two critical properties:

Admissibility : Each vertex $i$ is adjacent to at most one edge in $S$. The weights assigned to the
edges in $S$ (note, they are identical for a specific iteration $t$) define the parameters $\ell, \rho$. These
parameters determine the number of iterations.

Verification : Focusing on the violations in the primal solution allows us to produce a feasible
primal solution and verify $\alpha$. For each violated edge $(i,j)$ in the primal solution, we pick at
least one adjacent edge.

Any maximal matching in $E_{violated}$ satisfies both conditions. Since we consider the violated edges
only, the algorithm is natural. Observe that the multiplicative framework operates on dual
violations whereas the oracle operates on primal violations. In a sense, the problem of finding the
maximum matching in bipartite graphs reduces to the problem of repeatedly finding maximal
matchings in subgraphs defined by primal violations (corresponding to the edges). These
violations can be easily defined by a simple filtering condition, for example, does the input edge
satisfy $x_i + x_j < 1$ using the current solution $x$ of the primal, and can be implemented in the
semi-streaming model. We now proceed to discuss weighted graphs; observe that weights will
also arise naturally in the unweighted case as we improve Theorem 5.

3.3 The Not So Simple Case of MWM 

Note that the input for weighted problems in the semi-streaming model is a sequence of tuples
$\{(i, j, w_{ij})\}$ and the weights do not have to be stored. We can easily compute a maximal matching
in a single pass using $O(n)$ space and $O(m)$ time. It is shown in [10] that we can compute a 1/6
approximation to the maximum weighted matching in a single pass using $O(n)$ space and $O(m)$
time.

In a weighted graph, the verification condition must be strengthened to handle the complications
introduced by edge weights. In LP1, if we increase $x_i$ by 1, then all the edges adjacent to $i$ are
satisfied. Therefore if only a few primal constraints were violated then we could produce a primal
feasible solution which is close to $\alpha$. This is not true in LP3; we now have to increase $x_i$ by the
amount of violation. However, trying to fix the verification condition by itself does not help: any
change also has to ensure the admissibility condition, and a larger increase in $x_i$ corresponds to a
larger $\rho$.

Let $w(S) = \sum_{(i,j) \in S} w_{ij}$ denote the total weight for any set of edges $S$. The oracle will search for
a set $S$ of edges which satisfy the following:

Weighted Admissibility of S : There exists a matching $S'$ contained in $S$ such that
$w(S') = \Omega(w(S))$. We use $S'$ to construct a dual witness. Since $S'$ is a matching we will have some
control over $\ell, \rho$.

Weighted Verification of S : For each violated edge $(i,j)$ we pick at least one edge adjacent to
it whose weight is $\Omega(w_{ij})$. We need all of $S$ to produce an upper bound on the primal solution.



Algorithm 3 Oracle for LP3

1: Let $x_i = \frac{\alpha u_i^t}{\sum_j u_j^t}$.
2: Let $E_{violated,k} = \{(i,j) \mid x_i + x_j < w_{ij},\ \alpha/2^k < w_{ij} \le \alpha/2^{k-1}\}$.
3: Find a maximal matching $S_k$ in $E_{violated,k}$ for each $k = 1, \ldots, \lceil \log_2 \frac{n}{\delta} \rceil = O(\log n)$.
4: Let $S = \bigcup_k S_k$, $\Delta = w(S)$.
5: if $\Delta < \delta\alpha$ then
6:    For each $(i,j) \in S$, increase $x_i$ and $x_j$ by $2w_{ij}$.
7:    Further increase every $x_i$ by $\frac{\delta\alpha}{n}$. Return $x$ and report failure.
8: else
9:    $S' \leftarrow \emptyset$.
10:   repeat
11:      Pick a heaviest edge $(i,j)$ from $S$ and add it to $S'$.
12:      Eliminate all edges adjacent to $i$ or $j$ from $S$.
13:   until $S = \emptyset$
14:   Return $y_{ij} = \alpha/w(S')$ for $(i,j) \in S'$ and $y_{ij} = 0$ otherwise.
15: end if



In order to satisfy the modified conditions, we partition the edges depending on their weights.
Definition 2. An edge $(i,j)$ is in tier $k$ if $\alpha/2^k < w_{ij} \le \alpha/2^{k-1}$.
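The tiers of Definition 2 and the tiered oracle for LP3 can be sketched together as follows. This is an illustrative sketch under assumptions, not the authors' code: the names `tier_of` and `mwm_oracle`, the dictionary representation of the weights `u`, the tagged return value, and the choice of which side of the tier interval is closed are all assumptions. Extracting the matching by scanning the chosen edges in decreasing weight order is equivalent to repeatedly picking a heaviest edge and discarding its neighbours.

```python
import math


def tier_of(w, alpha):
    """Tier k >= 1 with alpha / 2**k < w <= alpha / 2**(k - 1); None if w > alpha or w <= 0."""
    if w <= 0 or w > alpha:
        return None
    k = 1
    while w <= alpha / 2 ** k:
        k += 1
    return k


def mwm_oracle(edges, u, alpha, delta):
    """Sketch of the weighted oracle; edges is an iterable of (i, j, w) read in one pass."""
    n = len(u)
    total = sum(u.values())
    x = {v: alpha * uv / total for v, uv in u.items()}
    k_max = max(1, math.ceil(math.log2(n / delta)))
    matched = {}          # tier -> vertices already matched in that tier
    S = []
    for i, j, w in edges:
        k = tier_of(w, alpha)
        if k is None or k > k_max or x[i] + x[j] >= w:
            continue      # outside the tiers, or not violated
        used = matched.setdefault(k, set())
        if i not in used and j not in used:   # greedy maximal matching per tier
            used.update((i, j))
            S.append((i, j, w))
    if sum(w for _, _, w in S) < delta * alpha:
        for i, j, w in S:
            x[i] += 2 * w
            x[j] += 2 * w
        for v in x:
            x[v] += delta * alpha / n
        return "fail", x
    taken, S_prime = set(), []
    for i, j, w in sorted(S, key=lambda e: -e[2]):   # heaviest edge first
        if i not in taken and j not in taken:
            taken.update((i, j))
            S_prime.append((i, j, w))
    w_prime = sum(w for _, _, w in S_prime)
    return "witness", {(i, j): alpha / w_prime for i, j, _ in S_prime}
```

The per-tier sets of matched vertices use $O(n \log n)$ space in total, which is consistent with the semi-streaming budget.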

Algorithm 3 is the oracle for LP3. Before we prove the admissibility and verification conditions we
prove a useful lemma which is a property of the constraints.

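Lemma 6. If $y$ satisfies $\sum_{(i,j) \in E} w_{ij} y_{ij} = \alpha$ then for any weights $u^t$, if we have
$x_i^t = \alpha u_i^t / (\sum_j u_j^t)$ then

$$M(D^t, y) = \frac{1}{\alpha} \left( \sum_{(i,j) \in E} y_{ij} (x_i^t + x_j^t - w_{ij}) \right).$$

Proof. From the definition of $M(D^t, y)$ we have:

$$\begin{aligned}
\alpha M(D^t, y) &= \sum_i x_i^t \left( \sum_{j: (i,j) \in E} y_{ij} - 1 \right) = \sum_{(i,j) \in E} y_{ij} (x_i^t + x_j^t) - \alpha \\
&= \sum_{(i,j) \in E} y_{ij} (x_i^t + x_j^t) - \sum_{(i,j) \in E} w_{ij} y_{ij} = \sum_{(i,j) \in E} y_{ij} (x_i^t + x_j^t - w_{ij}).
\end{aligned}$$

The lemma follows. □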
Lemma 7. (Weighted Admissibility.) The matching $S' \subseteq S$ constructed by Algorithm 3 satisfies
$w(S') \ge w(S)/5$. As a consequence, if $\Delta \ge \delta\alpha$ then Algorithm 3 returns an admissible dual witness
with $\rho = \frac{5}{\delta}$ and $\ell = 1$.

Proof. Observe that we choose at most one edge incident to $i$ from each tier of weight. Consider
the matching $S'$ constructed by Algorithm 3. Suppose that $(i,j) \in S'$ is in tier $k$ and consider the
edges $A_i = \{(i, j') \mid (i, j') \in S, j \ne j'\}$ which are eliminated from $S$ by the inclusion of this edge
$(i,j)$ in $S'$. Each element in $A_i$ has a lower weight than $(i,j)$; otherwise we would have chosen that
eliminated edge instead of $(i,j)$ (the weights cannot be equal since there cannot be any other edge
from tier $k$ which is incident on $i$). Therefore the edges in $A_i$ belong to tiers numbered $k + 1$ or
larger (since they have a lower weight). The weight of any ignored edge in tier $k + q$ can be upper
bounded by $w_{ij}/2^{q-1}$. These weights add up to at most $2w_{ij}$ since we have at most one edge from each
tier. Therefore the sum of the weights of the edges in $A_i, A_j$ amounts to at most $4w_{ij}$. Summing over
all $(i,j) \in S'$, $w(S - S') \le 4w(S')$ and thus $w(S') \ge w(S)/5$.

For the second part of the lemma, observe that $y_{ij} = \alpha/w(S') \le 5\alpha/w(S) = 5\alpha/\Delta \le \frac{5}{\delta}$. The
parameter $\ell$ remains 1 due to the same reason as in the proof of Lemma 3. We observe that (since
$y_{ij} = 0$ for $(i,j) \notin S'$):

$$c^T y = \sum_{(i,j)} w_{ij} y_{ij} = \sum_{(i,j) \in S'} w_{ij} y_{ij} = \sum_{(i,j) \in S'} w_{ij} \frac{\alpha}{w(S')} = \frac{\alpha\, w(S')}{w(S')} = \alpha.$$

Applying Lemma 6 we immediately get $\alpha M(D^t, y) = \sum_{(i,j) \in E} y_{ij} (x_i^t + x_j^t - w_{ij})$. Now if
$x_i^t + x_j^t - w_{ij} \ge 0$ then $y_{ij} = 0$ by construction, or in other words, $M(D^t, y) < 0 \le \delta$. The
lemma follows. □

Lemma 8. (Weighted Verification.) For every violated edge $(i,j)$ in one of the $O(\log n)$ tiers, we
pick at least one edge in $S$ adjacent to $(i,j)$ whose weight is at least $w_{ij}/2$. As a consequence, if
$\Delta \le \delta\alpha$ then the algorithm returns a feasible primal solution for LP3 with value at most $(1 + 5\delta)\alpha$.

Proof. Suppose that $(i,j)$ is violated in tier $k$. Then, since $S_k$ is a maximal matching, we must
have chosen at least one edge in $S_k$ which is adjacent to $i$ or $j$. The weight of that chosen edge
in $S_k$ has to be at least $w_{ij}/2$ since the weights of two edges that belong to the same tier differ at
most by a factor of 2.

For the second part of the proof we follow the argument in Lemma 4 with one change. Suppose
that $x_i + x_j < w_{ij}$ for an edge $(i,j) \in E$. We have two cases: either the edge was chosen in one
of the tiers (say $k$) or $w_{ij} < \delta\alpha/n$. The second case is easier: since we increase each $x_i$ by at least
$\delta\alpha/n$, we definitely satisfy the constraint for $(i,j)$ in this case.

For the first case, observe that in the first part of the lemma we proved that we selected an
edge $e \in S$ incident on $i$ or $j$ with weight $w_e \ge w_{ij}/2$. Therefore we increased $x_i$ or $x_j$ by at least
$2w_e \ge w_{ij}$. We satisfy the constraint for $(i,j)$ in this case as well. Therefore $x$ is feasible.

For each edge $(i,j) \in S$, we increase $\sum_i x_i$ by $4w_{ij}$. Therefore over all the edges we increase
$\sum_i x_i$ by $4w(S) = 4\Delta$. Since we started with $\sum_i x_i = \alpha$ we have $\sum_i x_i = \alpha + 4\Delta \le (1 + 4\delta)\alpha$
after we increase $x_i$ based on the edges. We now have an additional increase in each $x_i$ which adds $\delta\alpha$
to $\sum_i x_i$. The lemma follows. □

The rest of the argument is almost identical to MCM and the proof of Theorem 5 with four changes:
(i) $\rho$ increases to $5/\delta$ from its value in the MCM analysis, (ii) we need to set $\delta = \epsilon/30$ since the primal feasible solution returned
is at most $(1 + 5\delta)\alpha$, (iii) we start with the 1/6 approximation provided by [10] which uses $O(n)$
space, and (iv) for the final rounding scheme we use the recent result of [5]. The space bound
increases since the oracle now uses $O(n \log n)$ space (and as before we have $O(1/\epsilon)$ oracles being run
in parallel).
Theorem 9. For any e < ^ inT = 0(-j logn) passes, and 0{^j- + j logn) space we can compute
a (1 — e) approximation for maximum weighted matching in bipartite graphs. 

4 Reducing the Space Requirement and the Number of Passes 

So far we have not used the fact that we are trying to solve the same LP for different guesses of the 
parameter a. Moreover we have used one pass for each invocation of the oracle. The number of 
passes is equal to the number of iterations plus one; the first pass is used to guess the values of a. In 
this section, we first reduce the space required to manage the multiple guesses of a. Subsequently, 
we reduce the number of passes by executing multiple iterations of the algorithm in one pass - this 
can be viewed as making a "step" which is significantly larger than what is provided by the basic 
analysis in the previous section. We focus on the weighted case. 

4.1 Reducing the Space Requirement 

In what follows we show how to preserve the admissibility condition across different values of the
guessed parameter $\alpha$, and run the $O(1/\epsilon)$ guesses (in Theorem 9) in parallel without increasing the
space requirement by a factor $O(1/\epsilon)$. The key intuition is that we are trying to find feasible solutions
for the same instance of LP4 but different values of the objective function. If in a single iteration
we make progress for a large value of $\alpha$ then we also make progress for a smaller value of $\alpha$.

Observe that the proofs of Theorem [5] and [9] use the largest value of a for which we have not 
produced a feasible primal solution. Suppose that we can prove that we would make the same 
choices for different values of a. Then, when we produce a feasible primal solution for some guess 
of a (the oracle fails), it may be that for a smaller guess of a the oracle does not fail. We can 
continue with the smaller guess of a, as if the larger guess was never made! Therefore we will avoid 
running separate oracles for the different guesses of a and thereby save space. We begin with the 
following definition: 

Definition 3. A sequence $y^1, y^2, \ldots, y^t$ is admissible if all $y^q$ are admissible when we apply
$y^1, y^2, \ldots, y^t$ in the given order.

Lemma 10. Let $\alpha, \alpha'$ be guesses of the optimal solution with $\alpha > \alpha'$. If a sequence $y^1, y^2, \ldots, y^t$
is admissible for $\alpha$, the sequence is also admissible for $\alpha'$.

Proof. Consider running two copies of Algorithm 1 for the values $\alpha$ and $\alpha'$. Observe
that $M(i, y)$ only depends on $y$ and therefore the parameters $\ell, \rho$ do not depend on $\alpha, \alpha'$. Moreover
the actual weights of the edges do not change and therefore for any vector $y$ if $c^T y \ge \alpha$ then
$c^T y \ge \alpha'$.

Therefore to show admissibility, it suffices to prove that $M(\mathcal{D}^q, y^q) \le \delta$ for all $q \le t$ for the
smaller value $\alpha'$ assuming that $y^q$ satisfied $M(\mathcal{D}^q, y^q) \le \delta$ for all $q \le t$ for the larger value $\alpha$. We
prove this using induction.

Initially $u$ is the same for both copies of the algorithm (as described so far, we have used $u_i^1 = 1$,
but we will be changing this in the next section). Now $M(\mathcal{D}^1, y^1) = \frac{1}{\sum_j u_j^1} \sum_i u_i^1 M(i, y^1)$ and is
independent of $\alpha$. Therefore $y^1$ is admissible for $\alpha'$. This proves the base case.

Suppose that we have proven the hypothesis up to $q = k$ and we apply $y^1, \ldots, y^k$ to both the
algorithms corresponding to $\alpha$ and $\alpha'$. Observe that $\rho$ and $M(i, y)$ are unchanged and therefore the
weights $u_i^{k+1}$ are the same for both $\alpha$ and $\alpha'$. But $M(\mathcal{D}^{k+1}, y^{k+1}) = \frac{1}{\sum_j u_j^{k+1}} \sum_i u_i^{k+1} M(i, y^{k+1})$ and
for all $i$ both algorithms have the same value of $u_i^{k+1}$ and $M(i, y^{k+1})$ since these quantities
are independent of $\alpha$. Therefore $y^{k+1}$ is also admissible for $\alpha'$. The lemma follows by induction. □

The algorithm: We start with $\alpha$ being the upper bound of the maximum matching. Each time
the oracle fails, we reduce $\alpha$ by a $(1 + \delta)$ factor while keeping the weights of constraints and $\{y^t\}$
fixed. This is possible since the sequence of $y$ remains admissible with the same width parameter.
The total number of successful iterations remains the same but we need an additional iteration for
each time the oracle reports failure. However we only have to provision for solving one copy of the
oracle.
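To see why the extra iterations caused by oracle failures stay small, the sequence of guesses can be enumerated directly. The sketch below is illustrative only and rests on two assumptions that are not confirmed by the text: the shrink factor is taken to be $(1+\delta)$, and the search is assumed to stop once the guess drops below the weight $\alpha_0$ of the initial 6-approximate matching (which is a lower bound on the optimum). The function name is hypothetical.

```python
import math

def alpha_schedule(alpha0, delta):
    """Guesses of alpha visited when starting at 6*alpha0 and dividing by
    (1 + delta) after every oracle failure, until the guess falls below alpha0."""
    guesses = []
    alpha = 6.0 * alpha0
    while alpha >= alpha0:
        guesses.append(alpha)
        alpha /= (1.0 + delta)
    return guesses

# With delta = 0.5 the guesses are 6, 4, 2.67, 1.78, 1.19 (times alpha0).
example = alpha_schedule(1.0, 0.5)
```

Under these assumptions the number of guesses is about $\log 6 / \log(1+\delta)$, which is $O(1/\delta)$ for $\delta \le 1$, so failures add only $O(1/\delta)$ iterations in total.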

Algorithm 4 Improved Algorithm for MWM (reducing space).
1: In one pass, find a 6 approximate maximum matching using [10] and let $\alpha_0$ be the weight of
the matching.
2: $u_i^1 = 1$ for all $i \in [n]$ and $\alpha = 6\alpha_0$
3: for $t = 1$ to $T$ do
4: Given $u_i^t$, run the oracle (Algorithm 3).
5: If the oracle failed, decrease $\alpha$ by a factor $(1 + \delta)$ and repeat line 4.
6: Let $M(i, y^t) = A_i y^t - b_i$. ($y^t$ is an admissible dual witness now)
7: $u_i^{t+1} = \begin{cases} u_i^t (1+\epsilon)^{M(i,y^t)/\rho} & \text{if } M(i,y^t) \ge 0 \\ u_i^t (1-\epsilon)^{-M(i,y^t)/\rho} & \text{if } M(i,y^t) < 0 \end{cases}$
8: end for 

9: Output yjq^Ety*- 
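The weight update in line 7 of Algorithm 4 can be sketched as follows. This is a minimal illustration, assuming a dictionary representation of the vertex weights $u$ and of the witness $y$; the function names `vertex_violations` and `mwu_update` are hypothetical, and the parameters `eps` and `rho` stand for $\epsilon$ and $\rho$.

```python
def vertex_violations(vertices, y):
    """M(i, y) = sum_{j:(i,j) in E} y_ij - 1 for every vertex i."""
    m = {i: -1.0 for i in vertices}
    for (i, j), val in y.items():
        m[i] += val
        m[j] += val
    return m

def mwu_update(u, violation, eps, rho):
    """One multiplicative weight update: weights of vertices whose constraint is
    over-covered by y grow, weights of under-covered vertices shrink."""
    new_u = {}
    for i, ui in u.items():
        m = violation[i]
        if m >= 0:
            new_u[i] = ui * (1.0 + eps) ** (m / rho)
        else:
            new_u[i] = ui * (1.0 - eps) ** (-m / rho)
    return new_u

# Hypothetical example: a single edge (a, b) carries y = 3, vertex c is untouched.
u0 = {"a": 1.0, "b": 1.0, "c": 1.0}
viol = vertex_violations(u0, {("a", "b"): 3.0})   # a: 2, b: 2, c: -1
u1 = mwu_update(u0, viol, eps=0.5, rho=2.0)
```

Because the exponent is scaled by $1/\rho$, a larger width $\rho$ makes each step more conservative, which is why the iteration count grows with $\rho$.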



We can now show that Theorem 9 holds with space $O(n(T + \log n))$, but uses $T' = T + O(1/\epsilon)$ passes.
Formally,

Theorem 11. For any e < ^ inT = 0(-^ logn) passes, and 0(n(T + log n)) space we can compute 
a (1 — e) approximation for maximum weighted matching in bipartite graphs. 

4.2 Reducing the number of passes 

Consider the two conditions for the oracle given in the previous section, and for the sake of example,
consider the cardinality case. Suppose that we just performed an update based on a dual witness
$y$. Observe that $x^t = x(u^t)$ and for the next step, the admissibility condition ($M(\mathcal{D}^t, y) \le 0 \le \delta$)
remains satisfied as long as the edges $(i,j)$ in $y$ satisfy $x_i^t + x_j^t \le 1$. Therefore as a new approach,
we do not invoke the oracle again as long as we have such a solution. In other words, we can use the
same matching returned as a dual witness for multiple iterations or until one of its edges satisfies
the corresponding primal constraint $x_i + x_j \ge 1$.

Therefore it appears that we can simulate multiple iterations in a single pass. But if $x_i + x_j$
is close to 1 then this idea need not be useful because we may satisfy that edge in a single step.
Observe that this idea automatically brings up the notion of weights even in the context of MCM.
The high-level idea for the oracle is similar to the construction in Section 3, but there are significant
differences and two major issues arise.

• First, we cannot use uniform values for the entries of $y$ as in Section 3 even in the setting of
MCM. Suppose that $S$ contains $(i,j)$ and $(i',j')$ where $1 - x_{i'} - x_{j'}$ is greater than $1 - x_i - x_j$.
If we assign large values to $y_{ij}$ and $y_{i'j'}$, it decreases the number of iterations per pass (due
to normalization the $x_i$ for the matched edges rise quickly, and we satisfy the constraint). If
we assign small values to $y_{ij}$ and $y_{i'j'}$, it increases the total number of iterations and it may
also result in inadmissible $y$, i.e., $c^T y < \alpha$.

• Second, we have to modify the verification condition in Section 3 so that the condition handles
the values of $1 - x_i - x_j$ and keeps the increase of the solution minimal. For example, (again
using the cardinality case as an example) increasing the values of $x_i$ and $x_j$ by less than 1 in the
verification step can result in an infeasible primal solution. On the other hand, increasing $x_i$
and $x_j$ by 1 can result in a larger approximation factor.

In what follows, we avoid both the issues by defining the tier of an edge based on the violation
instead of the edge weight. Moreover we ensure that for different edges the $y_{ij}$ values are
different; this can be viewed as setting $w_{ij} y_{ij}$ proportional to the violation in $(i,j)$. Therefore
the accounting for the admissibility and verification conditions is different.

Definition 4. Define $v_{ij} = w_{ij} - x_i - x_j$ to be the (primal) violation of an edge $(i,j) \in E$ (the
edge is not violated if $v_{ij} \le 0$). An edge is in violation-tier $k$ if $\alpha/2^k < v_{ij} \le \alpha/2^{k-1}$.

Observe that if $\alpha \ge \max_{ij} w_{ij}$ then $v_{ij} \le w_{ij}$. For any set of edges $S$, define $V(S) = \sum_{(i,j)\in S} v_{ij}$.
Define $\hat{v}_{ij} = \max\{v_{ij}/w_{ij}, 0\}$.
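The quantities in Definition 4 are simple to compute per edge. The following sketch is illustrative; the function names are hypothetical, the fractional violation is taken to be $\max\{v_{ij}/w_{ij}, 0\}$, and the cap `max_tier` on the number of tiers is an assumption (edges with a smaller violation are simply reported as belonging to no tier).

```python
def violation(w_ij, x_i, x_j):
    """Primal violation v_ij = w_ij - x_i - x_j (not violated if <= 0)."""
    return w_ij - x_i - x_j

def fractional_violation(w_ij, x_i, x_j):
    """Violation relative to the edge weight, clipped at zero."""
    return max(violation(w_ij, x_i, x_j) / w_ij, 0.0)

def violation_tier(v, alpha, max_tier):
    """Return k with alpha/2**k < v <= alpha/2**(k-1), or None if v is not
    positive, exceeds alpha, or is too small for tiers 1..max_tier."""
    if v <= 0 or v > alpha:
        return None
    for k in range(1, max_tier + 1):
        if v > alpha / 2 ** k:
            return k
    return None
```

For example, with $\alpha = 8$ a violation of 4 lands in tier 2 (the interval $(2, 4]$), while a violation of 4.5 lands in tier 1.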

The improved oracle is given in Algorithm 5.

Algorithm 5 Improved Oracle for LP3
1: Let $x_i = \alpha u_i^t / \sum_j u_j^t$.
2: Let $E_{violated,k} = \{(i,j) \mid (i,j) \text{ is in violation-tier } k\}$ for $k = 1, \ldots, K = \lceil \log_2 \frac{n}{\delta} \rceil$.
3: Find a maximal matching $S_k$ in each $E_{violated,k}$.
4: Let $S = \bigcup_k S_k$ and $\Delta_v = V(S)$.
5: if $\Delta_v \le \delta\alpha$ then
6: For each $(i,j) \in S$, increase $x_i$ and $x_j$ by $2v_{ij}$.
7: Further increase all $x_i$ by $\delta\alpha/n$. Return $x$ and report failure.
8: else
9: $S' \leftarrow \emptyset$.
10: repeat
11: Pick an edge $(i,j)$ from $S$ with largest $v_{ij}$ and add it to $S'$.
12: Eliminate all edges adjacent to $i$ or $j$ from $S$.
13: until $S = \emptyset$
14: Let $\Delta'_v = V(S')$.
15: Return $y_{ij} = \hat{v}_{ij}\alpha/\Delta'_v$ for $(i,j) \in S'$ and $y_{ij} = 0$ otherwise.
16: end if
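The steps of Algorithm 5 can be sketched in Python as below. This is an offline illustration under stated assumptions, not the streaming implementation: the edge weights `w` and vertex weights `u` are held in dictionaries, maximal matchings are built greedily in dictionary order, the number of tiers is taken as $K = \lceil \log_2(n/\delta) \rceil$, and the witness value is taken as $y_{ij} = \hat{v}_{ij}\alpha/V(S')$ with $\hat{v}_{ij} = v_{ij}/w_{ij}$. All function names are hypothetical.

```python
import math

def maximal_matching(edges):
    """Greedy maximal matching over edges in the given order."""
    used, matching = set(), []
    for (i, j) in edges:
        if i not in used and j not in used:
            matching.append((i, j))
            used.update((i, j))
    return matching

def improved_oracle(w, u, alpha, delta):
    """Return ("fail", x) with a repaired primal x, or ("witness", y)."""
    n = len(u)
    total = sum(u.values())
    x = {i: alpha * u[i] / total for i in u}                 # line 1
    v = {(i, j): w[(i, j)] - x[i] - x[j] for (i, j) in w}    # violations
    K = math.ceil(math.log2(n / delta))
    tiers = {k: [] for k in range(1, K + 1)}
    for e, ve in v.items():                                  # line 2
        for k in range(1, K + 1):
            if alpha / 2 ** k < ve <= alpha / 2 ** (k - 1):
                tiers[k].append(e)
                break
    # lines 3-4: one maximal matching per tier, then their union
    S = [e for k in range(1, K + 1) for e in maximal_matching(tiers[k])]
    if sum(v[e] for e in S) <= delta * alpha:                # line 5
        for (i, j) in S:                                     # line 6
            x[i] += 2 * v[(i, j)]
            x[j] += 2 * v[(i, j)]
        for i in x:                                          # line 7
            x[i] += delta * alpha / n
        return "fail", x
    chosen, used = [], set()                                 # lines 9-13
    for (i, j) in sorted(S, key=lambda e: v[e], reverse=True):
        if i not in used and j not in used:
            chosen.append((i, j))
            used.update((i, j))
    dv = sum(v[e] for e in chosen)                           # line 14
    y = {e: (v[e] / w[e]) * alpha / dv for e in chosen}      # line 15
    return "witness", y

# Hypothetical instance with n = 4 vertices and uniform vertex weights.
u = {"a": 1.0, "b": 1.0, "c": 1.0, "d": 1.0}
w_big = {("a", "c"): 4.0, ("a", "d"): 2.0, ("b", "d"): 3.0}
status1, y = improved_oracle(w_big, u, alpha=4.0, delta=0.1)
w_small = {("a", "c"): 2.2, ("b", "d"): 2.0}
status2, x = improved_oracle(w_small, u, alpha=4.0, delta=0.1)
```

In the first call the total violation is large, so a witness with $\sum w_{ij} y_{ij} = \alpha$ is returned; in the second the violation is small, so the oracle repairs $x$ into a feasible primal solution of value at most $(1+5\delta)\alpha$.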



Lemma 12. In Algorithm 5, if $\Delta_v > \delta\alpha$ then we have a matching $y$ such that $\sum_{(i,j)\in E} w_{ij} y_{ij} = \alpha$,
and for all $i$ either $M(i, y) = -1$ or $M(i, y) \le \frac{5}{\delta}\hat{v}_{ij} - 1$ where $(i,j) \in S'$. As a consequence if
$\Delta_v > \delta\alpha$ then Algorithm 5 returns an admissible solution with $\ell = 1$ and $\rho = 5/\delta$.

Proof. The proof is similar to the proof of Lemma 7 except that we will use violations (whereas
the proof of Lemma 7 used the weights). Observe that since $y_{ij} = 0$ if $(i,j) \notin S'$ we have

$$\sum_{(i,j)\in E} w_{ij} y_{ij} = \sum_{(i,j)\in S'} w_{ij} y_{ij} = \sum_{(i,j)\in S'} w_{ij}\hat{v}_{ij}\frac{\alpha}{\Delta'_v} = \frac{\alpha}{\Delta'_v}\sum_{(i,j)\in S'} v_{ij} = \alpha$$

Note $\sum_i x_i = \alpha = \sum_{(i,j)\in E} w_{ij} y_{ij}$ (observe $\alpha$ is not changed within an iteration).
Now suppose that $(i,j) \in S'$ is in tier $k$ and consider edges adjacent to $i$. All of them are in tier
$k+1$ or higher and for each tier we have at most two edges (one adjacent to $i$ and one adjacent
to $j$) because we pick a maximal matching for each tier. So the total violation of edges that are
eliminated by $(i,j)$ is at most $\sum_{q \ge 1} 2v_{ij}/2^{q-1} = 4v_{ij}$. This shows that $V(S - S') \le 4V(S')$ and therefore
$V(S') \ge V(S)/5$. Hence $\alpha/\Delta'_v \le 5\alpha/\Delta_v < 5/\delta$ (otherwise the oracle has failed).

Now $M(i, y) = \sum_{j:(i,j)\in E} y_{ij} - 1$. Therefore if $i$ is unmatched in $S'$ we have $M(i, y) = -1$.
Otherwise $M(i, y) = y_{ij} - 1 = \hat{v}_{ij}\alpha/\Delta'_v - 1 \le \frac{5}{\delta}\hat{v}_{ij} - 1$ where $(i,j) \in S'$. This proves the first part
of the lemma.

For the second part observe that $c^T y = \sum_{(i,j)\in E} w_{ij} y_{ij} = \alpha$. Using Lemma 6 we have
$\alpha M(\mathcal{D}^t, y) = \sum_{(i,j)\in E} y_{ij}(x_i^t + x_j^t - w_{ij})$. Now if $x_i^t + x_j^t - w_{ij} \ge 0$ then $y_{ij} = 0$ by construction.
Therefore $\alpha M(\mathcal{D}^t, y) \le 0$ and so $M(\mathcal{D}^t, y) \le 0 \le \delta$. Finally $0 \le \hat{v}_{ij} \le 1$ and therefore
$-1 \le M(i, y) \le \frac{5}{\delta}$. The lemma follows. □

Lemma 13. If $\Delta_v \le \delta\alpha$, then Algorithm 5 returns a feasible solution for LP3 with value at most
$(1 + 5\delta)\alpha$.

Proof. The proof follows similar arguments as in the proof of Lemma 8 except that we use violations
in this proof instead of weights (as in the proof of Lemma 8). We consider the normalized weights $x$
as a primal candidate for LP3. If the oracle fails, we augment $x$ to obtain a feasible primal solution
with a small increase.

Suppose that $(i,j)$ is in violation-tier $k$. Since $S_k$ was maximal, there exists an edge in $S_k$ that is
adjacent to either $i$ or $j$. So $x_i$ or $x_j$ is increased by at least $\alpha/2^{k-1}$ and the constraint is satisfied.
If $(i,j)$ did not belong to any of the violation-tiers then its violation was less than $\delta\alpha/n$ and since
$x_i$ is increased by $\delta\alpha/n$ this constraint is also satisfied.

For each edge $(i,j) \in S$, we increase the objective value by $4v_{ij}$ (both endpoints rise by $2v_{ij}$) and $\sum_{(i,j)\in S} v_{ij} = \Delta_v \le \delta\alpha$.
Finally, we increase all $x_i$ by $\delta\alpha/n$ which increases the objective value by at most $\delta\alpha$. So our primal
solution has value at most $(1 + 5\delta)\alpha$. □

The next lemma is the central idea in this subsection. Consider running Algorithm 4 but with
Algorithm 5 as the oracle instead of Algorithm 3. Based on Lemmas 12 and 13 and Theorem 11
we know that in $T = O(\frac{1}{\epsilon^3} \log n)$ iterations we will find a $(1 - \epsilon)$ approximation of the maximum
weighted matching. Surprisingly, we will now prove that even if we do not update the witness $y$
for $1/\delta$ steps, the witness remains admissible!

Lemma 14. Consider running Algorithm 4 but with Algorithm 5 as the oracle instead of Algorithm 3.
If the dual witness $y$ computed by Algorithm 5 in iteration $t$ is admissible, then $y$
remains admissible for all iterations $t + q$ where $q \le 1/\delta$.

Proof. Since $y$ was admissible (in any iteration) $-\ell \le M(i, y) \le \rho$. Since $y$ was computed in
iteration $t$ we know from the proof of Lemma 12 that $\sum_{(i,j)\in E} w_{ij} y_{ij} = \alpha$ and $M(i, y)/\rho \le \hat{v}_{ij}^t$.
We use $\hat{v}_{ij}^t$ to indicate the fractional violation that was used to determine $y_{ij}$. Note that
$w_{ij}(1 - \hat{v}_{ij}^t) = x_i^t + x_j^t$. Also note $0 \le \hat{v}_{ij}^t \le 1$ for a violated edge.

Note that even though we are not updating $y$ across the iterations, $u, x$ are being updated
(using the same $y$ at every step) and we need to prove $M(\mathcal{D}^{t+q}, y) \le \delta$ for every $t + q$ where
$q \le 1/\delta$.

Then by Lemma 6

$$M(\mathcal{D}^{t+q}, y) = \frac{1}{\alpha} \sum_{(i,j)\in E} w_{ij} y_{ij}\, \frac{x_i^{t+q} + x_j^{t+q} - w_{ij}}{w_{ij}}$$

and since $\sum_{(i,j)\in E} w_{ij} y_{ij} = \alpha$, it is sufficient to show that $\frac{x_i^{t+q} + x_j^{t+q} - w_{ij}}{w_{ij}} \le \delta$ for $y_{ij} \ne 0$. This
means that we need to focus on the edges in the matching $S'$ only, since all other edges have $y_{ij} = 0$.

In the following $x^{t+q}$ refers to the iterations of Algorithm 4 using the witness $y$ at every step.
Note $\epsilon = \delta/(4\ell)$ (since we will eventually set $\delta \ll 1$, and therefore $\epsilon \ll 1$) and $\ell = 1$. Note that

$$x_i^{t+q} \le x_i^{t+q-1}\,\frac{(1+\delta/4)^{\hat{v}_{ij}^t}}{(1-\delta/4)^{\delta/5}} \le x_i^t\,\frac{(1+\delta/4)^{q\hat{v}_{ij}^t}}{(1-\delta/4)^{q\delta/5}}$$

The inequality on the left follows from $\rho = 5/\delta$. Note that we have the $1/(1 - \delta/4)^{\delta/5}$ term because
we decrease the weight of the unmatched vertices and then renormalize the total weight to $\alpha$ (we are
inductively assuming that we did not report failure up to iteration $t + q - 1$). The renormalization
effectively increases the weight of the matched vertices by the same factor. For any $\delta > 0$ and
$q \le 1/\delta$, we have $(1 + \delta/4)^q \le (1 + \delta/4)^{1/\delta} \le e^{1/4} < 2$ and $1/(1 - \delta/4)^{q\delta/5} \le 1/(1 - \delta/4)^{1/5} \le 1 + \delta$.
Therefore,

$$x_i^{t+q} + x_j^{t+q} \le (x_i^t + x_j^t)\, 2^{\hat{v}_{ij}^t} (1 + \delta)$$

Since $2^{\hat{v}} \le 1 + \hat{v}$ for $0 \le \hat{v} \le 1$ and $(x_i^t + x_j^t) = w_{ij}(1 - \hat{v}_{ij}^t)$ we have:

$$x_i^{t+q} + x_j^{t+q} \le w_{ij}(1 - \hat{v}_{ij}^t)(1 + \hat{v}_{ij}^t)(1 + \delta) \le w_{ij}(1 + \delta)$$

This implies that $\frac{x_i^{t+q} + x_j^{t+q} - w_{ij}}{w_{ij}} \le \delta$ for all $y_{ij} \ne 0$ and therefore $M(\mathcal{D}^{t+q}, y) \le \delta$. Now we can
claim using induction that $y$ remains admissible for all $q \le 1/\delta$ iterations (the inductive
hypothesis was necessary to ensure that $\alpha$ did not change). The lemma follows. □
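The proof of Lemma 14 rests on three elementary numeric facts: $(1+\delta/4)^{1/\delta} \le e^{1/4} < 2$, $1/(1-\delta/4)^{1/5} \le 1+\delta$, and $2^v \le 1+v$ on $[0,1]$. The short sketch below checks them on a few sample values; it is a sanity check for $\delta \in (0,1]$ only, not a proof, and the function names are hypothetical.

```python
import math

def check_step_bounds(delta):
    """Check the growth and renormalisation bounds used for q <= 1/delta."""
    growth = (1 + delta / 4) ** (1 / delta)      # bounds (1 + delta/4)^q
    renorm = 1 / (1 - delta / 4) ** (1 / 5)      # bounds the renormalisation term
    return growth <= math.exp(0.25) < 2 and renorm <= 1 + delta

def pow2_bound(v):
    """2**v <= 1 + v on [0, 1]; equality holds at both endpoints."""
    return 2 ** v <= 1 + v + 1e-12

def blowup(v, delta):
    """Upper bound on (x_i^{t+q} + x_j^{t+q}) / w_ij implied by the three facts."""
    return (1 - v) * (2 ** v) * (1 + delta)
```

Combining the facts, an edge with fractional violation $v$ ends with $x_i + x_j \le w_{ij}(1-v)(1+v)(1+\delta) \le w_{ij}(1+\delta)$, which is what `blowup` evaluates.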



Algorithm 6 Overall Algorithm for MWM.
1: In one pass, find a 6 approximate maximum matching using [10] and let $\alpha_0$ be the weight of
the matching. Also ensure $\alpha_0 \ge w_{ij}$ for all $(i,j) \in E$.
2: $u_i^1 = 1$ for all $i \in [n]$ and $\alpha = 6\alpha_0$
3: for $t = 1$ to $T$ do
4: Given $u_i^t$, run the oracle (Algorithm 5).
5: If the oracle failed, decrease $\alpha$ by a factor $(1 + \delta)$ and repeat line 4.
6: Let $M(i, y^t) = A_i y^t - b_i$. ($y^t$ is an admissible dual witness now)
7: $u_i^{t+1} = \begin{cases} u_i^t (1+\epsilon)^{M(i,y^t)/5} & \text{if } M(i,y^t) \ge 0 \\ u_i^t (1-\epsilon)^{-M(i,y^t)/5} & \text{if } M(i,y^t) < 0 \end{cases}$
8: end for 
9: Output 



Theorem 15. Theorem 11 holds with $T = O(\frac{1}{\epsilon^2} \log n)$ (and with $T + 1$ passes) using Algorithm 6.

Proof. We use Lemma 14 repeatedly. We can compute $y^1$ (using Algorithm 4 and Algorithm 5
as oracle) and use it for the next $1/\delta$ iterations. Observe that Lemma 14 shows that we cannot
report failure within these $1/\delta$ iterations and $\alpha$ cannot change. Repeating the same argument we
compute the witness only once every $1/\delta$ iterations. Observe that the overall algorithm simplifies to
the description given in Algorithm 6.

Therefore we have $O(\frac{\delta}{\epsilon^3} \log n) = O(\frac{1}{\epsilon^2} \log n)$ actual computations of the dual witness, and we have
a $(1 - \epsilon)$ approximation, where $\delta = \epsilon/30$. Computation of each $y$ requires a pass. Note that we
may need to repeat an iteration if the $y$ was not admissible (as in Theorem 11), but this only
adds $O(1/\epsilon)$ iterations. The space requirement is $O(n(T + \log n))$ since we need to only remember
the different $y$ values we computed. □
5 Removing the Dependency on n

In this section, we present algorithms for MCM and MWM where the number of passes does not
depend on the number of nodes. In each case we use Algorithm 6 but we use a subgraph of the
input graph and apply further analysis to bound the number of iterations $T$. Moreover, instead of
starting from an initial state $u_i^1 = 1$ we will start the algorithm with different values of $u_i^1$.

We will also need to use the Corollary [2] instead of Theorem [TJ Recall that the number of iter- 
ations of the multiplicative weight update framework is 0(^-(lnmaxj ^j)) where T* = ul/(J2j u j) 

and is the upper bound of T*. Of these, I iterations can be performed in a single pass. In what 
follows, we will reduce or bound the (lnmaxj ^j) term. The key observation we will use in this 

i 

regard is the following Lemma: 

Lemma 16. If $u_i^1 \ge w_{ij}$ for all $(i,j) \in E$ then during the execution of Algorithm 6, $x_i \le 2u_i^1$.

Proof. As the parameter $\alpha$ is decreased in Algorithm 6 (because a larger value of $\alpha$ ended up
returning a primal feasible solution and so we are now decreasing $\alpha$) the value of $x_i$ decreases
because $u$ remains unchanged but $\alpha$ decreases. Thus it suffices to analyze the case when we do not
change $\alpha$.

We first observe that if $x_i^t \ge w_{ij}$ for all $(i,j) \in E$ then vertex $i$ is not involved in any violations.
Then no edge adjacent to $i$ can be chosen in $y$ and we will have $M(i, y) = \sum_{j:(i,j)\in E} y_{ij} - 1 = -1$.
Then we will be setting $u_i^{t+1} = u_i^t(1 - \epsilon)^{1/5}$ (based on the subroutine Algorithm 6). Moreover
observe that $\sum_j u_j^{t+1} \ge \sum_j (1 - \epsilon)^{1/5} u_j^t$ since for every $j$ we have $M(j, y) \ge -1$. Therefore,

$$x_i^{t+1} = \frac{\alpha u_i^t (1-\epsilon)^{1/5}}{\sum_j u_j^{t+1}} \le \frac{\alpha u_i^t (1-\epsilon)^{1/5}}{(1-\epsilon)^{1/5}\sum_j u_j^t} = \frac{\alpha u_i^t}{\sum_j u_j^t} = x_i^t$$

Therefore $x_i$ can increase only if it is involved in some violation. But then $x_i < w_{ij} \le u_i^1$. So the
maximum value $x_i$ can achieve is when it is increased in a single step of Algorithm 6 to above $u_i^1$.
The maximum value n- +1 /«- in a step of Algorithm [6] is (l+e) 1 ^'^/ 5 < (1 + e) 1 / 5 . Note that e < ^ 
and therefore (1 + e) 1 / 5 < e 1 / 4 . However we may also be decreasing ^2 i u\; which can decrease by 
a factor (1 — e) 1 / 5 . Therefore Xj, which is the relative contribution of Uj to Ej n « can increase at 
most by a factor of e 1//4 (l — e) _1//5 which is at most 2 for e < ^. The lemma follows. □ 

In Algorithm [6] T • = x\/a. Note a > OPT/6 after the first pass where OPT is the maximum 
weighted matching. Therefore if uj > Wij for all (i,j) G E then using Lemma [TBI < -fjpip and 

— 12 ^opt^ = Q( ~o^^ )i where the last fact follows from the fact that J2j u ) — %OPT since 
each uiij is less than uj,ujj. Setting 5 = e/30 we get a variant of Theorem 1151 as follows: 

Theorem 17. If w. L j < uj for all edges (i,j) G E then for any e < \ in T passes where T = 

0( log ^)pt^ )> and 0(n(T + log n)) space we can compute a (1 — e) approximation for maximum 
weighted matching in bipartite graphs. 

5.1 The Simple Case of MCM 

In this context OPT denotes the size of the maximum cardinality matching in $G$. Consider
Algorithm 7 and the following lemma:

Lemma 18. Let $OPT_S$ denote the size of the maximum matching in the subgraph induced by the vertex
set $S \subseteq V$. Then (using the notation of Algorithm 7) we have $OPT - OPT_{S_{t+2}} \le \frac{2}{3}(OPT - OPT_{S_t})$.
This proof does not use bipartiteness.
Algorithm 7 A constant pass algorithm for maximum cardinality matching

Find a maximal matching and find a 2 approximation of OPT.
Let $S_0$ be the set of vertices that are matched.
for $t = 1$ to $\frac{\log(2/\epsilon')}{\log(3/2)}$ do
Find a maximal matching between $S_{t-1}$ and $V - S_{t-1}$. Let $T_t$ be the set of vertices in the
maximal matching.
$S_t = S_{t-1} \cup T_t$.
end for
Let $G'$ be the subgraph induced by $S_t$. This can be achieved by filtering the stream.
Run Algorithm 6 on $G'$.
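The vertex-set construction of Algorithm 7 can be sketched as follows. This is an illustration under assumptions: the stream is modelled as a Python list that is re-read once per pass, maximal matchings are built greedily in stream order, and the function name and the `rounds` parameter (the number of loop iterations) are hypothetical. The final call to Algorithm 6 on the induced subgraph is not shown.

```python
def grow_vertex_set(edge_stream, rounds):
    """Return S_rounds: S_0 is the vertex set of a maximal matching, and each
    later round adds the endpoints of a maximal matching between S and V - S."""
    used = set()
    for (i, j) in edge_stream:              # first pass: maximal matching
        if i not in used and j not in used:
            used.update((i, j))
    S = set(used)
    for _ in range(rounds):                 # one pass per round
        used = set()
        for (i, j) in edge_stream:
            crossing = (i in S) != (j in S)
            if crossing and i not in used and j not in used:
                used.update((i, j))
        S |= used
    return S

# Path a - b - c - d presented with the middle edge first: the greedy matching
# picks only (b, c), while the optimum {(a, b), (c, d)} has size 2.
stream = [("b", "c"), ("a", "b"), ("c", "d")]
S0 = grow_vertex_set(stream, 0)
S1 = grow_vertex_set(stream, 1)
```

After one extra round the induced subgraph already contains both edges of the optimal matching, which is the effect Lemma 18 quantifies in general.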



Proof. Fix an optimal solution in the original graph $G$ and an optimal solution in the subgraph
induced by $S_t$. From the difference of two matchings, we can find $OPT - OPT_{S_t}$ vertex disjoint
augmenting paths, say $\mathcal{P}$. We show that at least $\frac{1}{3}|\mathcal{P}|$ vertex disjoint augmenting paths are included
in the graph induced by $S_{t+2}$.

Order the vertex disjoint augmenting paths $\mathcal{P}$ arbitrarily. Let $i' - i - Z - j - j'$ be the first
path (where $Z$ is some sequence of vertices in $S_t$). Then $i, j \in S_t$ and $i' \ne j'$. In what follows we
will show the condition $\mathcal{C}$: we have an augmentation path $i'' - i - Z - j - j''$ available in $S_{t+2}$ for
some $i'' \ne j''$ with $i'', j'' \in V - S_t$ and $i'', j'' \in S_{t+2}$. (Note $\{i', j'\}$ can intersect $\{i'', j''\}$.)

If we prove this condition $\mathcal{C}$, then any augmentation path we find can remove at most two
additional paths in $\mathcal{P}$ (since $i'', j''$ are now unavailable). Therefore we can find at least $\frac{1}{3}|\mathcal{P}|$
augmentations in $S_{t+2}$. This means that $OPT_{S_{t+2}} \ge OPT_{S_t} + \frac{1}{3}(OPT - OPT_{S_t})$ and therefore

$$OPT - OPT_{S_{t+2}} \le OPT - \left(OPT_{S_t} + \tfrac{1}{3}(OPT - OPT_{S_t})\right) = \tfrac{2}{3}(OPT - OPT_{S_t})$$

Therefore if we prove this condition $\mathcal{C}$ the lemma follows. We now prove the condition $\mathcal{C}$. If both
$i', j'$ were included in the matching in step $t + 1$ or $t + 2$ then the condition holds with $i'' = i'$ and
$j'' = j'$ since we consider all edges in the induced subgraph. Therefore at least one of them, say $i'$,
was not included in any of the two maximal matchings in steps $t + 1, t + 2$. This means that $i$ was
matched to some $i_1$ in step $t + 1$ and some $i_2$ in step $t + 2$ with $i_1 \ne i_2$. If $j' \ne i_1$ then $j$ or $j'$ was
matched in step $t + 1$ since the edge $(j, j')$ was available. If $j'$ was matched then the condition is
satisfied with $i'' = i_1$ and $j'' = j'$. Otherwise $j$ was matched, say to $j''$, in step $t + 1$ and $j'' \ne i_1$
since $i$ is matched to $i_1$ in the same matching. The condition is satisfied with $i'' = i_1$. Therefore
the only remaining case is $j' = i_1$, but then the condition is satisfied with $i'' = i_2$ and $j'' = j'$.
Therefore the lemma follows. □

If Lemma 18 is repeated as many times as in Algorithm 7, the difference between OPT (the
size of the optimal matching in $G$) and the optimal solution in the subgraph $G'$ is at most
$(2/3)^{\log(2/\epsilon')/\log(3/2)} = 2^{-\log_2(2/\epsilon')} = \epsilon'/2$ times OPT (notice that $OPT - OPT_{S_0} \le OPT$ since we started
with a 2 approximation). Therefore $G'$ now contains a $(1 - \epsilon'/2)$ approximation of the maximum
matching in $G$. The size of each maximal matching is $O(|OPT|)$ and we repeat $O(\log \frac{1}{\epsilon'})$ times, so the
subgraph contains at most $O(|OPT| \log \frac{1}{\epsilon'})$ vertices. The number of passes to find the subgraph is
$O(\log \frac{1}{\epsilon'})$ since we can find a maximal matching in one pass. Using Theorem 17 we have:

Theorem 19. For any £ < \ Algorithm [?] provides a (1 — e) approximation for the maximum 
cardinality matching problem in bipartite graphs using T = O(^loglogj) passes, and 0(n'(T + 
logn')) space where n' = min{n, |OPT|log^}. This implies a |(1 — e) result for general graphs 
using the integrality gap results of [14, 15]. The size of the matching can be computed in $O(n' \log n')$
space.

Observe that to estimate the size we only need to remember $G'$ and the $u$, both of which can
be done using $O(n')$ space. The oracle (Algorithm 5) requires $O(n' \log n')$ space.

5.2 The Not So Simple Case of MWM 

The weighted case is significantly more difficult than the unweighted case. The subgraph will now
be expressed implicitly using the vertex weights $u_i$ as proxy. In the language of Linear Programming
this means that, instead of starting from a uniform random sample of the constraints, we will start
from a weighted sample. Let the maximum weighted matching be $\mathcal{M}$.

Before proceeding further, for the rest of this section we assume that the weights are discrete,
i.e., $w_{ij} \in \{1, (1 + \nu), \ldots, (1 + \nu)^L\}$ where $L = O(\frac{1}{\nu} \log \frac{n}{\nu})$, to simplify the analysis, for $\nu \le 1/6$.
This can be achieved in three steps and a single pass by: (i) using a single pass to find both a 1/6
approximation of $w(\mathcal{M})$ using the algorithm of [10], and the maximum weight edge. Denote the
larger of these two values by $w'$ (which is a lower bound on the weight of $\mathcal{M}$). (ii) deciding to
ignore all edges of weight less than $\nu w'/n$ and (iii) deciding to multiply all edge weights by $n/(\nu w')$ and then
performing the discretization by rounding down. Given any matching in this scaled setting, we
have a matching in the original setting which is related by a simple scaling factor. Note that the
discretization of the weights reduces the optimal solution by at most a $(1 - \nu)$ factor.
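Steps (ii) and (iii) of the discretization can be sketched as below. This is an illustrative sketch: the function name, the dictionary of edge weights, and the choice to return the level $k$ (meaning a discretized weight $(1+\nu)^k$) are assumptions, and edges are dropped when their scaled weight is strictly below 1. The small correction loops guard against floating-point error in the logarithm at exact powers of $(1+\nu)$.

```python
import math

def discretize(weights, w_prime, n, nu):
    """Drop edges lighter than nu*w_prime/n, scale by n/(nu*w_prime), and round
    each scaled weight down to a power of (1 + nu). Returns {edge: level k}."""
    scale = n / (nu * w_prime)
    levels = {}
    for e, w in weights.items():
        s = w * scale
        if s < 1.0:
            continue                         # step (ii): ignore very light edges
        k = int(math.floor(math.log(s) / math.log(1.0 + nu)))
        while (1.0 + nu) ** (k + 1) <= s:    # fix a logarithm that rounded down
            k += 1
        while (1.0 + nu) ** k > s:           # fix a logarithm that rounded up
            k -= 1
        levels[e] = k                        # step (iii): weight (1 + nu)**k
    return levels

# Hypothetical instance: n = 2, nu = 1/8 and w' = 16 give a scale factor of 1.
raw = {"e1": 1.0, "e2": 1.2, "e3": 1.265625, "e4": 0.9}
lv = discretize(raw, w_prime=16.0, n=2, nu=0.125)
```

Rounding down to a power of $(1+\nu)$ loses less than a $1/(1+\nu) \ge 1-\nu$ factor per edge, which matches the loss stated above.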

Given a discretized set of edges we run Algorithm 8. Algorithm 8, which computes the
weights of the vertices, is similar to Algorithm 7 but is significantly non-trivial. Let $\mathcal{M}'$ denote
the maximum weight matching in this new discretized setting and let its weight be $w(\mathcal{M}')$. We ensure
that:

CI: £X = ($ ^l)w(M'). 

C2: Let $G' = (V, E')$ be a subgraph that consists of edges $(i,j)$ such that $w_{ij} \le u_i^1, u_j^1$. Then, $G'$
contains a matching with weight at least $(1 - 3\nu)w(\mathcal{M}')$.

Hence, using Theorem 1171 we obtain an (1 — ^-approximation of the maximum matching Ai" in 
G' in 0(Jj log passes. Then using C2 we would have a (1 — 3^)(1 — e) approximation for the 
maximum weight matching Ai' in G' . This corresponds to a (1 — 3v) (1 — e)(l — v) approximation for 
the maximum weight matching Ai in G — the weights in the original graph are scaled differently, 
but the relationship is one-to-one. Setting v = ^ and e = e'/2 we would get a (1— e') approximation 
of Ai (for all e' < \). We now proceed to ensure CI and C2. 

Lemma 20. (Condition CI.) Yli u l = (ft m ^) w(M'). Bipartiteness is not used in this proof. 

Proof. For a vertex i define k(i) to be the maximum k with i E S?. Therefore uj = (1 + v) k<y% \ In 
what follows we will show a charging scheme where we charge u\ to different edges in M.' . Consider 
the edge = (i,j) that caused the inclusion of i to Note that \J\f(i',k')\ < q for all i',k'. 

At least one of i and j must be matched in A4', otherwise A4' is not optimal. Moreover either 
i or j must have an edge adjacent to it in Ai' with weight at least ^(1 + v) k ( % "> (otherwise we can 
remove both those edges and add ei to increase the weight of A4'). Let the edge with the larger 
weight (between the two possible edges in M! adjacent to i, j) be /(ej). We charge /(e^) the value 
uj. Note that /(ej),ej are adjacent. 

Now consider an edge e = (i',f) € M' with w e = (1 + v) k . This collects a charge for any 
vertex % in level k{i) such that |(1 + u) k ^ < (1 + v) k = w e . In each such level k(i), we can have 
Algorithm 8 A constant pass algorithm for maximum weighted matching.

1: $(i,j)$ is in level $k$ if $w_{ij} = (1 + \nu)^k$
2: for each level $k = 0, 1, \ldots, L$ in parallel do
3: Find a maximal matching $E_k^0$.
4: Let $C_k$ be the set of nodes matched in the maximal matching.
5: Let $S_k^1 = C_k$
6: for $t = 1$ to $q = 8\lceil \frac{1}{\nu} \ln \frac{1}{\nu} \rceil$ do
7: Find a maximal matching $E_k^t$ between $C_k$ and $V - S_k^t$.
8: Let $T_k^t$ be the set of nodes matched in the maximal matching.
9: $S_k^{t+1} = S_k^t \cup T_k^t$
10: end for
11: Let $\mathcal{N}(i, k)$ denote the neighbors of vertex $i$ in $\bigcup_t E_k^t$.
12: end for
13: Let $u_i^1 = (1 + \nu)^k$ for the maximum $k$ with $i \in S_k^{q+1}$
14: Let $G' = (V, E')$ where $E' = \{(i,j) : w_{ij} \le u_i^1, u_j^1\}$.
15: Run Algorithm 6 on $G'$ with initial weights $u_i^1$ and return its result.



e = f(ei) for at most 2q + 2 different vertices i since e^, e must be adjacent. If = (i', i) then there 
are at most q + 1 possibilities for i (including i'). This is because either i = i' or i € N{i' ,k(i)) 
and |A/"(i', A;(i))| < g. Counting the that arise from / as well, we know that e = /(e^) for at 
most 2q + 2 vertices z. Therefore the charge on e from vertices i with the largest value of k(i) 

is 2(q + l)2?/; e . From the vertices that are in the immediately lower level, the charge is 4 ^ 1 ^ 1 ^ c . 

Summing over all the levels, the charge is 

4 g + 1R 1 + — - + -—-2 + ■ • • < >-w e 

Summing over all edges in Af, since 4 ^ 1 ^ < ^| In i we have the desired result (we use the fact 
thatg + l<§). □ 

Lemma 21. (Condition C2.) G' contains a matching with weight at least (1 — 3v)w(A4'). 

Proof. We start with Ad' and modify it into a matching T so that T contains only the edges in 
G' . We charge the loss induced by the modification to the edges in A4' U T . We first describe the 
modification procedure: 

1. Initially M = A4' . J- = 0. We will maintain M U J to be a matching. 

2. Pick the edge in M with the highest weight. Let this edge be e = (i, j) with weight w e = 
(1 + u) k in level k. Since E% was a maximal matching, either i or j is in C^. Without loss of 
generality, let it be i. Thus k(i) > k. If k(j) > k then both uj,Uj are at least w e and e G G' . 
We add to J- and remove (i,j) from M. Therefore it suffices to consider k(j) < k and 
j $ S\ then i has at least q neighbors in and M(i, k) = q; since the edge (i, j) was available 
for potential inclusion in the q maximal matching. 

(a) If there is i! G J\f(i,k) that is not matched in M or T. Then k(i') > k and (i,i') G G'. 
Add (i,i') to J 7 and remove (i,j) from M. 

(b) Otherwise all i! G A/"(i, k) is matched in Af or T . If i' is matched in M denote its partner 
to be a(i',M). Otherwise i! is matched in T and denote its partner as a(i' 
i. If there exists at least q/2 vertices (q is even) in J\f(i,k) which are matched in F,
then delete (i, j) from M and every (i', a(i', F)) where i' G N(i, k) a red charge of 
\w e . 

ii. If there exists i' G Af{i, k) which is matched in M and its weight w((i',a(i', M))) < 
vw e , then we delete both and (i' ,a(i' , M)) from M and add to T. Note 

G G' . Then collects a green charge of w((i' , a(i', M))). 

iii. Otherwise, there exists at least q/2 vertices in Af(i, k) which are matched in M, let 
this set be Q{. Find the smallest weight edge in M incident on a vertex in Qi, let 
that vertex be io- Delete both and (io, a(io, M)) from M and add (i,io) to J- . 
Each edge (i f , a(i' , M)) where i' £ AA(i, fc) receives a blue charge of |^o- 

Observe that the sets $\mathcal{N}(i, k)$ are disjoint for a fixed $k$. This is because the matched vertices
$S_k^t$ are ruled out from participating in step $t + 1$ or later. Observe that the edges are added to $\mathcal{F}$
in non-increasing order of weight. Moreover, during the execution of the above procedure, at any
point we have the invariant $\mathcal{I}$: every edge in $\mathcal{F}$ has a weight at least as much as the
heaviest weight edge in $M$.

The red charges are collected by edges in $\mathcal{F}$. Consider edge $e \in \mathcal{F}$ with $w_e = (1 + \nu)^k$. Edge $e$
collects a red charge from edge $e'$ if $w_{e'} < w_e$. This is a consequence of the invariant $I$. Moreover,
for each $k' < k$ the edge $e$ can collect 2 such red charges for edges $e'$. This is because the two
endpoints can be in $\mathcal{N}(i, k')$ for at most 2 different choices of $i$ (this is a consequence of $\mathcal{N}(i, k')$
being disjoint for a fixed $k'$). The charge collected due to edges $e'$ in level $k'$ is $2 \cdot \frac{2}{q}(1 + \nu)^{k'}$. Thus
the total red charge collected by $e$ can be bounded by $\sum_{k'=0}^{k-1} 2 \cdot \frac{2}{q}(1 + \nu)^{k'} \le \frac{4}{\nu q}(1 + \nu)^k = \frac{4}{\nu q} w_e$. The
overall red charge sums to $\frac{4}{\nu q} w(\mathcal{F})$.

The green charges are collected by edges in $\mathcal{F}$. Edge $e \in \mathcal{F}$ collects a green charge from edge
$e'$ if $w_{e'} \le \nu w_e$. Moreover, this charge is collected at most once. Therefore the total green charge
is at most $\nu w(\mathcal{F})$.

The blue charges are collected by the edges in $\mathcal{M}'$. Consider edge $e \in \mathcal{M}'$ with $w_e = (1 + \nu)^k$
which collects a blue charge when edge $e'$ was the heaviest weight edge in $\mathcal{M}$ and was deleted
from $\mathcal{M}$ along with $e''$. Observe that since we were considering the edges in $\mathcal{M}$ in decreasing order
of weight, $w_{e'} \ge w_e$. Moreover $w_e \ge w_{e''}$, otherwise we would have deleted $e$ and charged $e''$ in
that step. And finally, $w_{e''} > \nu w_{e'}$, otherwise we would be in the green case. Therefore we have
$w_e > \nu w_{e'}$ and the edge $e$ is charged at most $\frac{2}{q} w_{e''} \le \frac{2}{q} w_e$.

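Let $w_{e'} = (1 + \nu)^{k'}$; then $(1 + \nu)^{k'} \ge (1 + \nu)^k > \nu(1 + \nu)^{k'}$ and we have $0 \le k' - k \le \frac{2}{\nu} \ln \frac{1}{\nu}$.
The edge $e$ can collect 2 such blue charges for edges $e'$ in level $k'$ (again this follows from $\mathcal{N}(i, k')$ being
disjoint). The total blue charge on edge $e$ is at most $\left(\frac{2}{\nu} \ln \frac{1}{\nu}\right) \cdot 2 \cdot \frac{2}{q} w_e = \left(\frac{8}{\nu q} \ln \frac{1}{\nu}\right) w_e$. The overall blue
charge sums to at most $\frac{8}{\nu q} \ln \frac{1}{\nu} \, w(\mathcal{M}')$.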

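Observe that we maintained that $w(\mathcal{M}') = w(\mathcal{F}) + \Delta$ where $\Delta$ is the total charge. Putting the
charges together, we have

$$w(\mathcal{M}') \le w(\mathcal{F}) + \frac{4}{\nu q} w(\mathcal{F}) + \nu w(\mathcal{F}) + \frac{8}{\nu q} \ln \frac{1}{\nu} \, w(\mathcal{M}').$$

Using $q = \left\lceil \frac{8}{\nu^2} \ln \frac{1}{\nu} \right\rceil$ and rearranging we get

$$(1 - \nu) w(\mathcal{M}') \le \left(1 + \frac{4}{\nu q} + \nu\right) w(\mathcal{F}) \le (1 + 2\nu) w(\mathcal{F}).$$

This translates to $w(\mathcal{F}) \ge \frac{1 - \nu}{1 + 2\nu} w(\mathcal{M}') \ge (1 - 3\nu) w(\mathcal{M}')$ for all sufficiently small $\nu$, and the lemma follows. □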

It takes $q = O((\frac{1}{\nu})^2 \log \frac{1}{\nu})$ passes (for this setting of $\nu$) and $O(nL) = O(\frac{n \log n}{\nu})$ space to find the
subgraph $G'$. Therefore (changing variables), we have

Theorem 22. For any sufficiently small $\epsilon$, in $T = O(\frac{1}{\epsilon^2} \log \frac{1}{\epsilon})$ passes and $O(n(T + \frac{\log n}{\epsilon}))$ space we can compute
a $(1 - \epsilon)$ approximation for the maximum weighted matching in bipartite graphs. This translates to
a $\frac{2}{3}(1 - \epsilon)$ approximation for general graphs using the integrality gap results of [14, 15]. The weight
can be estimated using $O(\frac{n}{\epsilon} \log n)$ space.

Observe that to estimate the weight we need to compute and store the subgraph $G'$, which can
be done using $O(\frac{n}{\epsilon} \log n)$ space since we need to remember $O(n)$ vertices for each of the discretized
weight levels. If we are only interested in the weight, the computation of Algorithm 6
needs only $O(n \log n)$ space for the oracle and can remember $u$ in space $O(n)$.



6 Extensions: the b-Matching Problem and the Maximum Matching Problem in General Graphs

In this section, we present algorithms for the b-matching problem and MWM in general graphs.
Both algorithms are based on the idea from Section 4.1. In Section 6.1 we present algorithms
for the capacitated b-matching problem in bipartite graphs (defined shortly). In Section 6.2 we
discuss the uncapacitated b-matching problem in bipartite graphs. In Section 6.3 we present an
approximation scheme for maximum weighted matching for general non-bipartite graphs.



6.1 The Maximum (Capacitated) Bipartite b-Matching Problem

The stream is a sequence of tuples $\{(i, j, c_{ij}, w_{ij})\}_{(i,j) \in E}$. We assume $c_{ij}$ are all integers for all $i, j$.

Definition 5. In the (capacitated) b-matching problem, each vertex $i$ has demand $b_i$ and each edge
$(i, j)$ has capacity $c_{ij}$ and weight $w_{ij}$. A multiset of edges is a b-matching if the multiplicity of
each edge $(i, j)$ is at most $c_{ij}$ and $i$ is the endpoint of at most $b_i$ edges (counting the multiplicity of
the edges) in the set. The maximum (capacitated) b-matching problem is to find a b-matching that
maximizes the total weight of edges (again, accounting for the multiplicity of the edges).
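
To make Definition 5 concrete, the following Python sketch checks whether a multiset of edges is a b-matching and computes its weight. The function names and the dictionary-based representation are illustrative assumptions and are not part of the algorithms in this paper.

```python
from collections import defaultdict


def is_b_matching(multiplicity, b, capacity):
    """Check Definition 5 for a multiset of edges.

    multiplicity: dict mapping edge (i, j) -> multiplicity d_ij (hypothetical layout)
    b:            dict mapping vertex -> demand b_i
    capacity:     dict mapping edge (i, j) -> capacity c_ij
    """
    load = defaultdict(int)
    for (i, j), d in multiplicity.items():
        # Each edge may be used at most c_ij times.
        if d < 0 or d > capacity[(i, j)]:
            return False
        load[i] += d
        load[j] += d
    # Each vertex i may be the endpoint of at most b_i edges, counting multiplicity.
    return all(load[v] <= b[v] for v in load)


def b_matching_weight(multiplicity, weight):
    """Total weight of the multiset, counting multiplicity."""
    return sum(d * weight[e] for e, d in multiplicity.items())
```

For instance, with demands `{'u1': 2, 'u2': 1, 'v1': 2, 'v2': 1}`, a multiset that uses the edge `('u1', 'v1')` twice and `('u1', 'v2')` once is rejected because `u1` would be the endpoint of three edges.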

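We refer to $b_i$ as the capacity of a vertex $i$ and let $B = \sum_i b_i$ be the total capacity of all vertices.
We assume that we have $O(B)$ space since the solution can have $O(B)$ edges in it. LP5 and LP6
are the primal and dual linear programs with integrality gap one [31]. Algorithm 9 is the oracle for
LP6.

$$\begin{array}{lll}
\max & \sum_{(i,j) \in E} w_{ij} y_{ij} & \\
\text{s.t.} & \frac{1}{b_i} \sum_{j : (i,j) \in E} y_{ij} \le 1 & \forall i \\
 & \frac{y_{ij}}{c_{ij}} \le 1 & \forall (i,j) \in E \\
 & y_{ij} \ge 0 & \forall (i,j) \in E
\end{array} \qquad \text{(LP5)}$$

$$\begin{array}{lll}
\min & \sum_i x_i + \sum_{(i,j) \in E} z_{ij} & \\
\text{s.t.} & \frac{x_i}{b_i} + \frac{x_j}{b_j} + \frac{z_{ij}}{c_{ij}} \ge w_{ij} & \forall (i,j) \in E \\
 & x_i \ge 0 & \forall i \\
 & z_{ij} \ge 0 & \forall (i,j) \in E
\end{array} \qquad \text{(LP6)}$$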
Algorithm 9 Oracle for LP6

Let $x_i$ and $z_{ij}$ be the current weights $u_i$ and $u_{ij}$, normalized so that $\sum_i x_i + \sum_{(i,j) \in E} z_{ij} = \alpha$.
Let $E_{violated,k} = \{(i, j) \mid x_i/b_i + x_j/b_j + z_{ij}/c_{ij} < w_{ij}, \ \alpha/2^k < w_{ij} \le \alpha/2^{k-1}\}$.
Find a maximal b-matching $S_k$ in $E_{violated,k}$ for each $k = 1, \cdots, \lceil \log(n/\delta) \rceil$.
Let $S = \cup_k S_k$, $\Delta = w(S)$.
Let $d_{ij}$ be the multiplicity of $(i, j)$ in $S$.
if $\Delta < \delta\alpha$ then
    For each $(i, j) \in S$, increase $x_i$ and $x_j$ by $2 d_{ij} w_{ij}$ and $z_{ij}$ by $d_{ij} w_{ij}$.
    Further increase all $x_i$ by $\delta\alpha/n$. Return $x$ and report failure.
else
    repeat
        Pick the heaviest edge $(i, j)$ from $S$.
        Add $(i, j)$ to $S'$.
        Suppose that $(i, j) \in S_k$.
        for $k' = k + 1, k + 2, \cdots$ do
            Reduce multiplicities of edges adjacent to $i$ and $j$ from $S_{k'}$ by $d_{ij}$ in total.
        end for
        Remove $(i, j)$ from $S$.
    until $S = \emptyset$
    Let $d'_{ij}$ be the multiplicity of $(i, j)$ in $S'$.
    Return $y_{ij} = \alpha d'_{ij} / w(S')$ for $(i, j) \in S'$ and $y_{ij} = 0$ otherwise.
end if
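
The tiering step of Algorithm 9 can be sketched in Python as follows. This is only an illustration under two assumptions that the text does not state explicitly: the logarithm is taken base 2, and the upper end of each weight interval is inclusive. The function name `tier_of` is hypothetical.

```python
import math


def tier_of(w, alpha, n, delta):
    """Return the tier k with alpha/2**k < w <= alpha/2**(k-1), or None.

    Tiers range over k = 1, ..., ceil(log2(n/delta)) (base 2 is an assumption).
    Edges whose weight falls outside every tier are not considered by the
    maximal b-matching step.
    """
    num_tiers = math.ceil(math.log2(n / delta))
    for k in range(1, num_tiers + 1):
        if alpha / 2**k < w <= alpha / 2**(k - 1):
            return k
    return None
```

Under these assumptions, an edge that falls in no tier and has weight at most $\alpha$ has weight at most $\alpha / 2^{\lceil \log(n/\delta) \rceil} \le \delta\alpha/n$, which is the case handled separately in the proof of Lemma 24.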



Computing $z_{ij}$: The computation of $z_{ij}$ is also not trivial since we cannot store all values of $u_{ij}$.
In one pass, we can count the number of edges and therefore, we know the number of constraints
in LP5. Observe that $u_{ij}$ values are identical for all edges that have never been selected for
$S'$. So if we remember $u_{ij}$ values only for edges that have been in $S'$, we can compute $z_{ij}$ for all
edges.

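Lemma 23. If $\Delta \ge \delta\alpha$, Algorithm 9 returns an admissible solution $y$ with $\ell = 1$ and $\rho = 5/\delta$.

Proof. We first observe that $\sum_{(i,j) \in E} w_{ij} y_{ij} = \sum_{(i,j) \in S'} w_{ij} y_{ij} = \alpha$.

Now observe that (dropping the superscript $t$ for this equation)

$$\begin{aligned}
\alpha M(\mathcal{D}, y) &= \sum_i x_i M(i, y) + \sum_{(i,j) \in E} z_{ij} M(ij, y) \\
&= \sum_i x_i \left( \frac{1}{b_i} \sum_{j : (i,j) \in E} y_{ij} - 1 \right) + \sum_{(i,j) \in E} z_{ij} \left( \frac{y_{ij}}{c_{ij}} - 1 \right) \\
&= \sum_{(i,j) \in E} y_{ij} \left( \frac{x_i}{b_i} + \frac{x_j}{b_j} + \frac{z_{ij}}{c_{ij}} \right) - \left( \sum_i x_i + \sum_{(i,j) \in E} z_{ij} \right) \\
&= \sum_{(i,j) \in S'} y_{ij} \left( \frac{x_i}{b_i} + \frac{x_j}{b_j} + \frac{z_{ij}}{c_{ij}} \right) - \alpha \qquad \text{(By normalization.)} \\
&= \sum_{(i,j) \in S'} y_{ij} \left( \frac{x_i}{b_i} + \frac{x_j}{b_j} + \frac{z_{ij}}{c_{ij}} - w_{ij} \right)
\end{aligned}$$

Now if $\frac{x_i}{b_i} + \frac{x_j}{b_j} + \frac{z_{ij}}{c_{ij}} - w_{ij} \ge 0$ then $y_{ij} = 0$. Therefore $\alpha M(\mathcal{D}, y) \le 0 \le \delta$.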

Observe that $-1 \le M(i, y) \le \frac{\alpha}{w(S')}$ since $\sum_j d'_{ij} \le b_i$. Likewise $-1 \le M(ij, y) \le \frac{\alpha}{w(S')}$ since
$d'_{ij} \le c_{ij}$.

Finally observe that for each edge $(i, j) \in S'$ we eliminate at most $2 d_{ij}$ elements from each
higher tier $k'$ (which means weight is lower). Therefore the total elimination is at most $4 d_{ij} w_{ij}$
(same argument as in Lemma 7), which means $w(S - S') \le 4 w(S')$. Therefore $w(S) \le 5 w(S')$ and
$\alpha / w(S') \le 5\alpha / \Delta \le 5/\delta$. The lemma follows. □

Lemma 24. If $\Delta < \delta\alpha$, a feasible solution for LP6 with value at most $(1 + 6\delta)\alpha$ is returned.

Proof. Suppose the edge $(i, j)$ was considered in one of the tiers. For each $(i, j) \in E$ with multiplicity
$d_{ij}$ ($d_{ij} = 0$ for $(i, j) \notin S$), the algorithm does not give a higher value of $d_{ij}$ to $(i, j)$ because $(i, j)$, $i$
or $j$ did not allow so in $S_k$. If $(i, j)$ was the problem, that is $d_{ij} = c_{ij}$, we increase $z_{ij}$ by $c_{ij} w_{ij}$ and
the primal constraint for $(i, j)$ is satisfied. Otherwise one of the vertices $i, j$ had $b_i$ adjacent edges.
Suppose it is $i$. Then we increase $x_i$ by at least $b_i w_{ij}$ because all the edges adjacent to $i$ contribute
twice their weight (times the respective multiplicity) and these edges are from tier $k$ or lower
(which means weight is at least half of $w_{ij}$). So the primal constraint is satisfied for $(i, j)$.

If $(i, j)$ was not considered in one of the tiers then $w_{ij} \le \delta\alpha/n$. But then increasing $x_i$ by
$\delta\alpha/n$ makes this constraint satisfied as well. Therefore, the returned solution is feasible. For each
$(i, j) \in S$, we increase the objective value by $5 d_{ij} w_{ij}$, for a total of $5 w(S) < 5\delta\alpha$. In addition we increase each
$x_i$ by $\delta\alpha/n$ and the total increase is $6\delta\alpha$. □

We can now apply Theorem [H using $\epsilon/36$ and the space saving idea of Section 4.1 of slowly
lowering the guess of $\alpha$. Note $|S'| = O(B)$ and $|S| = O(B \log n)$. In this case we do not have
an easy $O(1)$ approximation. However we can easily guess a factor $n$ approximation and run the
algorithm for $O(\frac{1}{\epsilon} \log n)$ guesses of the optimum solution. Observe that the number of iterations
is only additive in the number of guesses of the optimum solution (see Section 4.1). Since we only
need to provision for a single copy of the oracle, we have:

Theorem 25. For any e < ^, in T = O(-^logn) passes and O(BT) space, Algorithm^ and 
Algorithm[l\ together provide a (1 — e) approximation for the optimum b-matching problem. 

Note that we only find a fractional solution in this case. Also note that the actual space
requirement depends on $c_{ij}$ as well as $b_i$. As the minimum value of $c_{ij}$ increases, $|S|$ decreases
and therefore we need less space. An extreme case of this is the uncapacitated problem, which we
discuss next.



6.2 The Maximum Uncapacitated b-Matching Problem

In this section, we present two algorithms for the maximum uncapacitated b-matching problem: a
constant pass algorithm whose space depends on $B$, and a near linear space algorithm
whose number of passes depends on $\log n$. The uncapacitated b-matching problem
is a special case of the capacitated b-matching problem where all the capacities are infinite. LP7
and LP8 are the primal and dual LPs for the uncapacitated b-matching problem.

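$$\begin{array}{lll}
\max & \sum_{(i,j) \in E} w_{ij} y_{ij} & \\
\text{s.t.} & \frac{1}{b_i} \sum_{j : (i,j) \in E} y_{ij} \le 1 & \forall i \\
 & y_{ij} \ge 0 & \forall (i,j) \in E
\end{array} \qquad \text{(LP7)}$$

$$\begin{array}{lll}
\min & \sum_i x_i & \\
\text{s.t.} & \frac{x_i}{b_i} + \frac{x_j}{b_j} \ge w_{ij} & \forall (i,j) \in E \\
 & x_i \ge 0 & \forall i
\end{array} \qquad \text{(LP8)}$$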
$O(\frac{1}{\epsilon^2} \log \frac{1}{\epsilon})$-pass $O(B(\frac{1}{\epsilon^2} \log \frac{1}{\epsilon} + \frac{\log n}{\epsilon}))$-space algorithm: The uncapacitated b-matching problem
reduces to the maximum matching problem with $O(B)$ vertices |32[I3U]. Each vertex $i$ with duplicity
$b_i$ becomes $b_i$ vertices $i_1, i_2, \cdots, i_{b_i}$. Each edge $(i, j)$ becomes $b_i b_j$ edges $(i_1, j_1), (i_1, j_2), \cdots, (i_{b_i}, j_{b_j})$.
Let the resulting graph be $G'$. It is easy to see that any b-matching in $G$ corresponds to a matching
of the same weight in $G'$ and the converse also holds. This transformation preserves bipartiteness.
Using Algorithm 8 for $G'$, we obtain the following result.
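
The vertex-duplication reduction described above can be sketched in Python as follows. The function name and the list-of-tuples representation are illustrative assumptions. The sketch materializes all $b_i b_j$ copies of each edge only for clarity; it is not meant to reflect how a streaming algorithm would store $G'$.

```python
def expand_b_matching_instance(b, weighted_edges):
    """Build the edge list of G' from an uncapacitated b-matching instance on G.

    b:              dict mapping vertex -> b_i
    weighted_edges: list of (i, j, w_ij) tuples describing G
    Copy number a of vertex i is represented as the pair (i, a), a = 1..b_i.
    """
    expanded = []
    for i, j, w in weighted_edges:
        for a in range(1, b[i] + 1):
            for c in range(1, b[j] + 1):
                # Every copy of i is joined to every copy of j with the same weight.
                expanded.append(((i, a), (j, c), w))
    return expanded
```

Since copies of a vertex stay on the same side as the original vertex, a bipartition of $G$ carries over to the expanded graph.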

Corollary 26 (Theorem 22). For any sufficiently small $\epsilon$, in $T = O(\frac{1}{\epsilon^2} \log \frac{1}{\epsilon})$ passes and $O(B(T + \frac{\log n}{\epsilon}))$ space
we can compute a $(1 - \epsilon)$ approximation for the maximum weighted uncapacitated b-matching in
bipartite graphs.

n)-pass 0(43-)-space algorithm: Since the uncapacitated problem is a special case of the 
capacitated b-matching problem, we can apply Algorithm 9. However, the uncapacitated problem
differs from the capacitated b-matching problem in that we can always find a maximal b-matching
with at most $n - 1$ (distinct) edges in the former. If we have a solution where $\sum_{j : (i,j) \in E} y_{ij} = b_i$,
we say that the vertex $i$ is saturated. Suppose that we are given an edge $(i, j)$ and neither of
the vertices $i, j$ is saturated. We can saturate $i$ or $j$ by increasing $y_{ij}$. This process saturates at least one
vertex each time it increases the number of edges in the b-matching. As a consequence it gives a maximal
b-matching of at most $n - 1$ edges, which leads to the following corollary.

Corollary 27 (Theorem [25]). For any e < | in T = O(pTogn) passes, and 0(nT) space we can 
compute a (1 — e) approximation for the maximum weighted uncapacitated b-matching in bipartite 
graphs. 
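
The saturation argument behind this corollary can be illustrated with a short Python sketch. It is a minimal single-pass greedy, assuming positive integer demands $b_i$; the function name and data layout are hypothetical. Whenever an edge arrives with both endpoints unsaturated, its multiplicity is raised until one endpoint becomes saturated.

```python
def greedy_maximal_b_matching(b, edges):
    """Greedy maximal uncapacitated b-matching.

    b:     dict mapping vertex -> b_i (assumed to be positive integers)
    edges: iterable of (i, j) pairs in stream order
    Returns (y, residual): edge multiplicities and the remaining demand per vertex.
    """
    residual = dict(b)
    y = {}
    for i, j in edges:
        # Largest increase of y_ij that keeps both endpoints within their demand.
        inc = min(residual[i], residual[j])
        if inc > 0:
            y[(i, j)] = y.get((i, j), 0) + inc
            residual[i] -= inc
            residual[j] -= inc  # at least one endpoint is now saturated
    return y, residual
```

Each edge that receives positive multiplicity needs two unsaturated endpoints when it arrives and saturates at least one new vertex, so when the $k$-th such edge is added at least $k - 1$ vertices are already saturated and two are not. Hence $k + 1 \le n$, which matches the bound of $n - 1$ distinct edges.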






6.3 The Maximum Weight Matching Problem for General Graphs 

Algorithms in Section 5 achieve $(\frac{2}{3} - \epsilon)$-approximations because the integrality gap of LP4 is $\frac{2}{3}$
for general graphs [14, 15]. With additional constraints, we can write a linear program for MWM
in general graphs with integrality gap one. LP9 and LP10 are the primal and dual LP for general
graphs. In addition to the constraints in MWM, we have a constraint for each odd subset $U$ of $V$.
The polytope determined by constraints in LP9 is the convex hull of all matchings [fH 130].

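$$\begin{array}{lll}
\max & \sum_{(i,j) \in E} w_{ij} y_{ij} & \\
\text{s.t.} & \sum_{j : (i,j) \in E} y_{ij} \le 1 & \forall i \\
 & \frac{1}{\lfloor |U|/2 \rfloor} \sum_{i,j \in U} y_{ij} \le 1 & \forall U \subseteq V, \ |U| \text{ odd} \\
 & y_{ij} \ge 0 & \forall (i,j) \in E
\end{array} \qquad \text{(LP9)}$$

$$\begin{array}{lll}
\min & \sum_i x_i + \sum_U z_U & \\
\text{s.t.} & x_i + x_j + \sum_{U \ni i,j} \frac{z_U}{\lfloor |U|/2 \rfloor} \ge w_{ij} & \forall (i,j) \in E \\
 & x_i, z_U \ge 0 & \forall i, \ \forall U
\end{array} \qquad \text{(LP10)}$$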

The violation and weight that correspond to the odd-set constraints are 

M{u ' y,) ~ (wM,g/« 

u\j = Yl (1 + e) M( ^ )/p • Yl (l-e) M(C/ ^ )/p 

M(U,y*)>0 A/((7,y t )<0 

Since there are exponentially many odd sets $U$, we cannot store all weights $u_U^t$. Instead, we
remember non-zero values of $y^t$ for all $t$. Then, we can recompute the values of $u_U^t$ given $U$ without
reading the data stream.

There is another problem due to the number of constraints. Since the number of constraints is
exponential in $n$, the number of iterations is linear in $n$, whereas we want the number of iterations to be
polynomial in $\log n$ and $\frac{1}{\epsilon}$. To reduce the number of constraints, we simply ignore all constraints
corresponding to U with \U\ > 4. Then, the number of constraints is 0(n s ) and therefore, 
the number of iterations is O(^logre). From a feasible solution for the modified formulation, 
we obtain a feasible solution for LP9 by scaling. If the constraints corresponding to vertices are
satisfied, $\sum_{i,j \in U} y_{ij} \le (1 + \epsilon) \lfloor |U|/2 \rfloor$ for every ignored set $U$. Therefore, if we scale all $y_{ij}$ down by a factor of $1 + \epsilon$, we
satisfy all the constraints in LP9.



Algorithm 10 Oracle for LP10

Keep $y^{t'}$ for all $t' = 1, 2, \cdots, t - 1$.
for each $U \subseteq V$, $|U|$ is odd do
    Compute $u_U^t$ from the stored $y^{t'}$, according to the definition of the odd-set weights above.
    $W \leftarrow W + u_U^t$.
end for
Let $x_i = \alpha u_i^t / W$ and $z_U = \alpha u_U^t / W$. Instead of storing all $z_U$, $z_U$ is recomputed for each $(i, j) \in U$.
Let $E_{violated,k} = \{(i, j) \mid x_i + x_j + \sum_{U \ni i,j} \frac{z_U}{\lfloor |U|/2 \rfloor} < w_{ij}, \ \alpha/2^k < w_{ij} \le \alpha/2^{k-1}\}$.
Find a maximal matching $S_k$ in $E_{violated,k}$ for each $k = 1, \cdots, \lceil \log(n/\delta) \rceil$.
Let $S = \cup_k S_k$, $\Delta = w(S)$.
if $\Delta < \delta\alpha$ then
    For each $(i, j) \in S$, increase $x_i$ and $x_j$ by $2 w_{ij}$.
    Further increase $x_i$ by $\delta\alpha/n$. Return $x$.
else
    repeat
        Pick a heaviest edge $(i, j)$ from $S$ and add it to $S'$.
        Eliminate all edges adjacent to $i$ or $j$ from $S$.
    until $S = \emptyset$
    Return $y_{ij} = \alpha / w(S')$ for $(i, j) \in S'$ and $y_{ij} = 0$ otherwise.
end if
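
The filtering loop at the end of Algorithm 10 (repeatedly pick a heaviest edge, then eliminate the edges adjacent to it) can be sketched in Python as below. Sorting the edges once by decreasing weight and skipping edges with an already used endpoint is an equivalent way to carry out that loop offline; the function names and tuple layout are illustrative assumptions.

```python
def heaviest_first_matching(weighted_edges):
    """Return S': scan edges by decreasing weight, keeping an edge only if
    neither endpoint is already covered by a previously kept edge.

    weighted_edges: iterable of (i, j, w_ij) tuples, playing the role of S.
    """
    covered = set()
    chosen = []
    for i, j, w in sorted(weighted_edges, key=lambda e: e[2], reverse=True):
        if i not in covered and j not in covered:
            chosen.append((i, j, w))
            covered.update((i, j))
    return chosen


def oracle_y(chosen, alpha):
    """Set y_ij = alpha / w(S') on the edges of S'; all other y_ij are 0."""
    total = sum(w for _, _, w in chosen)
    return {(i, j): alpha / total for i, j, _ in chosen}
```

On the path with edge weights 5, 4, 3, 1 the sketch keeps the edges of weight 5 and 3, so $w(S') = 8$ and each kept edge receives $y_{ij} = \alpha/8$.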



Algorithm 10 is the oracle for LP10. It is similar to Algorithm 3. The proof of correctness
is almost identical to Section 3. We also use the ideas in Section 4.1, which hold in this case as
well. One drawback of this algorithm is its running time. For each $(i, j)$ we have to enumerate

all U that contain There are n 0<yS > subsets that contain For each U, it takes 0(4^) 

time to compute the weight because there are at most O(^) edges for each t'. Therefore, it takes 

0( n b m ) time per pass. We obtain the following theorem: 

Theorem 28. InT = 0(4j logn) passes and 0{ n ° % >m ) time, we can compute a (1 — e) approx- 
imation for maximum weighted matching in general graphs. The algorithm uses O(reT) space. 